Abstract

A two-terminal interactive function computation problem with alternating messages is studied within the framework of 
distributed block source coding theory. For any arbitrary fixed number of messages, a single-letter characterization of the 
sum-rate-distortion function was provided in previous work using traditional information-theoretic techniques. This, however, 
does not directly lead to a satisfactory characterization of the infinite-message limit, which is a new, unexplored dimension for 
asymptotic analysis in distributed block source coding involving potentially infinitesimal-rate messages. This paper introduces 
a new convex-geometric approach to provide a blocklength-free single-letter characterization of the infinite-message sum-rate- 
distortion function as a functional of the joint source pmf and distortion levels. This characterization is not obtained by taking 
a limit as the number of messages goes to infinity. Instead, it is in terms of the least element of a family of partially-ordered 
marginal-perturbations-concave functionals defined by the coupled per-sample distortion criteria. For computing the Boolean 
AND function of two independent Bernoulli sources at one and both terminals, with zero Hamming distortion, the respective 
infinite-message minimum sum-rates are characterized in closed analytic form. These sum-rates are achievable using infinitely 
many infinitesimal-rate messages. The convex-geometric functional viewpoint also suggests an iterative algorithm for evaluating 
any sum-rate-distortion function, including, as a special case, the Wyner-Ziv rate-distortion function. 

I. Introduction 

In this paper we study a two-terminal interactive function computation problem with alternating messages (Fig. 1) within
a distributed block source coding framework. Here, $n$ samples $\mathbf{X} := X^n := (X(1), \ldots, X(n)) \in \mathcal{X}^n$ of one component
of a discrete memoryless multi-source are available at terminal $A$ and $n$ samples $\mathbf{Y} \in \mathcal{Y}^n$ of another component of the
multi-source are available at a different terminal $B$. The two component sources of the multi-source are statistically dependent.
Terminal $A$ is required to produce a sequence $\hat{\mathbf{Z}}_A \in \mathcal{Z}_A^n$ such that $d_A^{(n)}(\mathbf{X}, \mathbf{Y}, \hat{\mathbf{Z}}_A) \le D_A$, where $d_A^{(n)}$ is a distortion
function of $3n$ variables. Similarly, terminal $B$ is required to produce a sequence $\hat{\mathbf{Z}}_B \in \mathcal{Z}_B^n$ such that
$d_B^{(n)}(\mathbf{X}, \mathbf{Y}, \hat{\mathbf{Z}}_B) \le D_B$. All alphabets and distortion functions are assumed to be finite. To achieve the desired
objective, $t$ coded messages, $M_1, \ldots, M_t$, of respective bit rates (bits per source sample), $R_1, \ldots, R_t$, are sent
alternately from the two terminals starting with some terminal. The
message sent from a terminal can depend on the source samples at that terminal and on all the previous messages (which
are available to both terminals). There is enough memory at both terminals to store all the source samples and messages.
After $t$ messages, terminal $A$ produces a sequence $\hat{\mathbf{Z}}_A \in \mathcal{Z}_A^n$ and terminal $B$ produces a sequence $\hat{\mathbf{Z}}_B \in \mathcal{Z}_B^n$. The
sum-rate-distortion function $R_{\mathrm{sum},t}(D_A, D_B)$ is the infimum of the sum of all rates $\sum_{i=1}^{t} R_i$ for which the following criteria hold:
$\mathbb{P}(d_A^{(n)}(\mathbf{X}, \mathbf{Y}, \hat{\mathbf{Z}}_A) > D_A) \to 0$ and $\mathbb{P}(d_B^{(n)}(\mathbf{X}, \mathbf{Y}, \hat{\mathbf{Z}}_B) > D_B) \to 0$ as $n \to \infty$.

Fig. 1. Interactive distributed source coding with $t$ alternating messages.

When the criterion is of the form of vanishing probability of block error (Sec. II-A), or of the form of expected per-sample
distortion (Sec. VI-A), for any fixed number $t$, a single-letter characterization of the set of all feasible coding rate-distortion
tuples (the rate-distortion region) and of the sum-rate-distortion function $R_{\mathrm{sum},t}(D_A, D_B)$ was provided in our previous work [1],
[2] using traditional information-theoretic techniques. This characterization, however, does not directly lead to a satisfactory
characterization of the infinite-message limit $R_{\mathrm{sum},\infty}(D_A, D_B) := \lim_{t \to \infty} R_{\mathrm{sum},t}(D_A, D_B)$. The objective of this paper is to
provide a characterization of $R_{\mathrm{sum},\infty}(D_A, D_B)$ which is not obtained by taking a limit as the number of messages goes
to infinity, and also an iterative algorithm to evaluate it. Understanding the sum-rate-distortion function in the limit where
potentially an infinite number of alternating messages are allowed to be exchanged will shed light on the fundamental benefit
of cooperative interaction in two-terminal problems. While asymptotics involving blocklength, rate, quantizer step-size, and
network size have been explored in the distributed block source coding literature, asymptotics involving an infinite number of
messages, each with potentially infinitesimal rate, have not been studied. The number of messages is a relatively unexplored
resource and a new dimension for asymptotic analysis.

This paper introduces a new convex-geometric approach to provide a blocklength-free single-letter characterization of 
the infinite-message sum-rate-distortion function as a functional of the joint source distribution and distortion levels. This 
characterization is not obtained by taking a limit as the number of messages goes to infinity. Instead, it is in terms of the 
least element of a family of partially-ordered, marginal-perturbations-concave functionals defined by the coupled per-sample 
distortion criteria. For computing the Boolean AND function of two independent Bernoulli sources at one/both terminals 
with zero Hamming distortion, the respective infinite-message minimum sum-rates are characterized in closed analytic form 
and shown to be achievable using infinitely many infinitesimal-rate messages. The functional viewpoint also leads to an 
iterative algorithm for evaluating any finite-message sum-rate-distortion function. 

Related interactive computation problems have been studied extensively in the area of communication complexity [3], 
[4] where the main focus is on exact zero error computation, without regard for the statistical dependencies in samples 
across terminals, and where computing efficiency is gauged in terms of the order-of-magnitude of the total number of bits 
exchanged rather than bit rate (notable exceptions to this main focus are [5], [6]). Two-way distributed block source coding where 
the goal is to reproduce the sources with a non-zero per-sample distortion, as opposed to computing functions, was studied 
by Kaspi [7] who characterized the $t$-message sum-rate-distortion function in each direction. Orlitsky and Roche [8] studied 
two-terminal samplewise function computation with a vanishing block-error probability and characterized the feasible rates 
and the minimum sum-rate for two alternating messages (t = 2). A more detailed account of related work appears in [2]. 

If we choose the per-sample distortion function to be the Hamming distortion with respect to a function of X and Y and set 
the distortion level to zero, the characterization of the sum-rate-distortion function essentially reduces to the characterization 
of the minimum sum-rate of the problem with the criterion of vanishing probability of block-error [2, Proposition 3]. In 
this sense, the per-sample distortion criterion is more general than the vanishing probability of error criterion. For clarity 
of exposition, however, we first focus on the vanishing probability of block-error criterion in Sec. II. After formulating the 
problem in Sec. II-A, in Sec. II-B we recap results from [1], [2] that are needed for the subsequent development. We present 
the characterization of the infinite-message minimum sum-rate in Sec. III. We provide an iterative algorithm for evaluating 
any finite-message minimum sum-rate in Sec. IV. We evaluate the infinite-message minimum sum-rates for two examples 
in Sec. V. In Sec. VI we show how these results extend to the general rate-distortion problem. 

Notation: Vectors are denoted by boldface letters; the dimension will be clear from the context. The acronym 'iid' stands
for independent and identically distributed and 'pmf' stands for probability mass function. With the exception of the symbols
$R$, $D$, $N$, $A$, and $B$, random quantities are denoted in upper case and their specific instantiations in lower case. For integers
$i$, $j$, with $i \le j$, $V_i^j$ denotes the sequence of random variables $V_i, \ldots, V_j$. For $i \ge 1$, $V_1^i$ is abbreviated to $V^i$. If $j < i$ then "$V_i^j$"
denotes the void expression "". More generally, if $\{Q_i\}_{i \in S}$ is a set of quantities $Q_i$ indexed by a subset $S$ of integers then for
all integers $i$ not in $S$, "$Q_i$" = "". For a set $S$, $S^n$ denotes the $n$-fold Cartesian product $S \times \ldots \times S$. The support-set of a pmf
$p$ is the set over which it is strictly positive and is denoted by $\mathrm{supp}(p)$. If $\mathrm{supp}(q) \subseteq \mathrm{supp}(p)$ then we write $q \ll p$. The set of
all pmfs on alphabet $\mathcal{A}$, i.e., the probability simplex in $\mathbb{R}^{|\mathcal{A}|}$, is denoted by $\Delta(\mathcal{A})$. $X \sim \mathrm{Ber}(p)$ means $p_X(1) = 1 - p_X(0) = p$,
and $h_2(p)$ denotes its entropy. $X \perp\!\!\!\perp Y$ means $X$ and $Y$ are independent. The indicator function of set $S$, which is equal to one
if $x \in S$ and is zero otherwise, is denoted by $1_S(x)$. Symbols $\wedge$ and superscript $c$ represent Boolean AND and complement
respectively.

II. Interactive function computation problem 

A. Problem formulation 

We consider two statistically dependent discrete memoryless stationary sources taking values in finite alphabets. For
$i = 1, \ldots, n$, let $(X(i), Y(i)) \sim$ iid $p_{XY}(x, y)$, $x \in \mathcal{X}$, $y \in \mathcal{Y}$, $|\mathcal{X}| < \infty$, $|\mathcal{Y}| < \infty$. Here, $p_{XY}$ is a joint pmf which describes the
statistical dependencies among the samples observed at the two terminals at each time instant $i$. Let $f_A : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}_A$
and $f_B : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}_B$ be functions of interest at terminals $A$ and $B$ respectively, where $\mathcal{Z}_A$ and $\mathcal{Z}_B$ are finite alphabets.
The desired outputs at terminals $A$ and $B$ are $\mathbf{Z}_A$ and $\mathbf{Z}_B$ respectively, where for $i = 1, \ldots, n$, $Z_A(i) := f_A(X(i), Y(i))$ and
$Z_B(i) := f_B(X(i), Y(i))$.

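Definition 1: A two-terminal interactive distributed source code (for function computation) with initial terminal $A$ and
parameters $(t, n, |\mathcal{M}_1|, \ldots, |\mathcal{M}_t|)$ is the tuple $(e_1, \ldots, e_t, g_A, g_B)$ of $t$ block encoding functions $e_1, \ldots, e_t$ and two block decoding
functions $g_A, g_B$, of blocklength $n$, where for $j = 1, \ldots, t$,

$$
(\text{Enc.}j) \quad e_j : \begin{cases} \mathcal{X}^n \times \bigotimes_{i=1}^{j-1} \mathcal{M}_i \to \mathcal{M}_j, & \text{if } j \text{ is odd}, \\ \mathcal{Y}^n \times \bigotimes_{i=1}^{j-1} \mathcal{M}_i \to \mathcal{M}_j, & \text{if } j \text{ is even}, \end{cases}
$$

$$
(\text{Dec.}A) \quad g_A : \mathcal{X}^n \times \bigotimes_{j=1}^{t} \mathcal{M}_j \to \mathcal{Z}_A^n,
$$

$$
(\text{Dec.}B) \quad g_B : \mathcal{Y}^n \times \bigotimes_{j=1}^{t} \mathcal{M}_j \to \mathcal{Z}_B^n.
$$

The output of $e_j$, denoted by $M_j$, is called the $j$-th message, and $t$ is the number of messages. The outputs of $g_A$ and $g_B$
are denoted by $\hat{\mathbf{Z}}_A$ and $\hat{\mathbf{Z}}_B$ respectively. For each $j$, $(1/n) \log_2 |\mathcal{M}_j|$ is called the $j$-th block-coding rate (in bits per sample).
The sum of all the individual rates $(1/n) \sum_{j=1}^{t} \log_2 |\mathcal{M}_j|$ is called the sum-rate.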

Definition 2: A rate tuple $\mathbf{R} = (R_1, \ldots, R_t)$ is admissible for $t$-message interactive function computation with initial
terminal $A$ if, $\forall \epsilon > 0$, $\exists\, N(\epsilon, t)$ such that $\forall n > N(\epsilon, t)$, there exists an interactive distributed source code with initial terminal
$A$ and parameters $(t, n, |\mathcal{M}_1|, \ldots, |\mathcal{M}_t|)$ satisfying

$$
\frac{1}{n} \log_2 |\mathcal{M}_j| \le R_j + \epsilon, \quad j = 1, \ldots, t,
$$

$$
\mathbb{P}(\mathbf{Z}_A \ne \hat{\mathbf{Z}}_A) \le \epsilon, \quad \mathbb{P}(\mathbf{Z}_B \ne \hat{\mathbf{Z}}_B) \le \epsilon.
$$

Note that of interest here are the probabilities of block error $\mathbb{P}(\mathbf{Z}_A \ne \hat{\mathbf{Z}}_A)$ and $\mathbb{P}(\mathbf{Z}_B \ne \hat{\mathbf{Z}}_B)$, which are multi-letter distortion
functions. The set of all admissible rate tuples, denoted by $\mathcal{R}_t^A$, is called the operational rate region for $t$-message interactive
function computation with initial terminal $A$. The rate region is closed and convex due to the way it has been defined. The
minimum sum-rate $R^A_{\mathrm{sum},t}$ is given by $\min \left( \sum_{j=1}^{t} R_j \right)$ where the minimization is over $\mathbf{R} \in \mathcal{R}_t^A$. For initial terminal $B$, the rate
region and the minimum sum-rate are denoted by $\mathcal{R}_t^B$ and $R^B_{\mathrm{sum},t}$ respectively. The focus of this paper is on the minimum
sum-rate rather than the rate region.



We allow the number of messages $t$ to be equal to 0. When $t = 0$, there is no message transfer and the initial terminal
is irrelevant. Thus for $t = 0$, in the notation for the minimum sum-rate, we omit the superscript and denote the minimum
sum-rate as $R_{\mathrm{sum},0}$.

For a given initial terminal, for $t = 0$ and $t = 1$, function computation may not be feasible for general $p_{XY}$, $f_A$, $f_B$. If the
computation is infeasible, $\mathcal{R}_t^A$ is empty and we set $R^A_{\mathrm{sum},t} = +\infty$. If for some specific $p_{XY}$, $f_A$, $f_B$, the computation is feasible,
then $R^A_{\mathrm{sum},t}$ will be finite. Note that for $t \ge 2$, the computation is always feasible and $R^A_{\mathrm{sum},t}$ is finite.

For all $j \le t$, null messages, i.e., messages for which $|\mathcal{M}_j| = 1$, are permitted by Definition 1. Hence, a $(t-1)$-message
interactive code is a special case of a $t$-message interactive code. Thus, $R^A_{\mathrm{sum},(t-1)} \ge R^A_{\mathrm{sum},t}$ and $R^A_{\mathrm{sum},(t-1)} \ge R^B_{\mathrm{sum},t}$ (see
[1, Proposition 1] for a detailed discussion). Therefore, $\lim_{t \to \infty} R^A_{\mathrm{sum},t} = \lim_{t \to \infty} R^B_{\mathrm{sum},t} =: R_{\mathrm{sum},\infty}$. The limit $R_{\mathrm{sum},\infty}$ is the
infinite-message minimum sum-rate.

Depending on the specific joint pmf $p_{XY}$ and functions $f_A$ and $f_B$, it may be possible to reach the infinite-message limit
$R_{\mathrm{sum},\infty}$ with finite $t$ (see end of Sec. V-B for examples).

For all finite $t$, a single-letter characterization of the operational rate region $\mathcal{R}_t^A$ and the minimum sum-rate $R^A_{\mathrm{sum},t}$ were
respectively provided in Theorem 1 and Corollary 1 of [1]. As discussed in Sec. II-B, this does not, in general, directly lead
to a satisfactory characterization of the infinite-message limit $R_{\mathrm{sum},\infty}$, which is a new, unexplored dimension for asymptotic
analysis in distributed block source coding involving potentially an infinite number of infinitesimal-rate messages. The main
goal and contribution of this paper is the development of a general convex-geometric blocklength-free characterization of
this infinite-message limit.

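B. Characterization of $R^A_{\mathrm{sum},t}$ for finite $t$

Let $\mathcal{U}_1, \ldots, \mathcal{U}_t$ be finite alphabets whose cardinalities are bounded as follows:

$$
|\mathcal{U}_j| \le \begin{cases} |\mathcal{X}| \left( \prod_{i=1}^{j-1} |\mathcal{U}_i| \right) + t - j + 3, & j \text{ odd}, \\ |\mathcal{Y}| \left( \prod_{i=1}^{j-1} |\mathcal{U}_i| \right) + t - j + 3, & j \text{ even}. \end{cases} \tag{1}
$$

Note that these bounds are independent of blocklength $n$. For $j = 1, \ldots, t$, $j$ odd, let $p_{U_j|XU^{j-1}}$ denote a conditional pmf
where for each $(x, u^{j-1}) \in \mathcal{X} \times \mathcal{U}_1 \times \ldots \times \mathcal{U}_{j-1}$, $p_{U_j|XU^{j-1}}(\cdot|x, u^{j-1}) \in \Delta(\mathcal{U}_j)$. Similarly, for $j = 1, \ldots, t$, $j$ even, let $p_{U_j|YU^{j-1}}$
denote a conditional pmf where for each $(y, u^{j-1}) \in \mathcal{Y} \times \mathcal{U}_1 \times \ldots \times \mathcal{U}_{j-1}$, $p_{U_j|YU^{j-1}}(\cdot|y, u^{j-1}) \in \Delta(\mathcal{U}_j)$. Let $X, Y, U_1, \ldots, U_t$
denote random variables taking values in $\mathcal{X}, \mathcal{Y}, \mathcal{U}_1, \ldots, \mathcal{U}_t$ respectively with joint pmf $p_{XYU^t} = p_{XY}\, p_{U^t|XY}$ where for all
$(x, y) \in \mathcal{X} \times \mathcal{Y}$ and all $u^t \in \bigotimes_{j=1}^{t} \mathcal{U}_j$,

$$
p_{U^t|XY}(u^t|x, y) = p_{U_1|X}(u_1|x) \cdot p_{U_2|YU_1}(u_2|y, u_1) \cdot p_{U_3|XU^2}(u_3|x, u^2) \cdots \tag{2}
$$

Here, $X$ and $Y$ are referred to as the source random variables and $U^t$ as the auxiliary random variables. Note that $p_{U^t|XY}$ is
a conditional pmf where for each $(x, y) \in \mathcal{X} \times \mathcal{Y}$, $p_{U^t|XY}(\cdot|x, y) \in \Delta(\mathcal{U}_1 \times \ldots \times \mathcal{U}_t)$. The factorization of $p_{U^t|XY}(u^t|x, y)$ in
(2) is equivalent to the following Markov chain conditions involving $X$, $Y$, $U^t$: for $i = 1, \ldots, t$, if $i$ is odd, $U_i - (X, U^{i-1}) - Y$
forms a Markov chain, otherwise $U_i - (Y, U^{i-1}) - X$ forms a Markov chain. Let

$$
\mathcal{P}^A_{\mathrm{mc},t} := \{\text{all conditional pmfs } p_{U^t|XY} \text{ of the form (2)}\}. \tag{3}
$$

Thus, $\mathcal{P}^A_{\mathrm{mc},t}$ is a family of conditional pmfs parameterized (continuously) by the conditional pmfs $p_{U_1|X}$, $p_{U_2|YU_1}$, $\ldots$. For
finite $t$, $\mathcal{P}^A_{\mathrm{mc},t}$ is a compact subset of a finite-dimensional Euclidean space. Let

$$
\mathcal{P}_{\mathrm{ent},t}(p_{XY}, f_A, f_B) := \{p_{U^t|XY} : H(f_A(X, Y)|X, U^t) = H(f_B(X, Y)|Y, U^t) = 0\}. \tag{4}
$$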

Note that for all $t \ge 2$, the set $\mathcal{P}_{\mathrm{ent},t}$ is not empty because one can choose $U_1$ and $U_2$ such that $H(X|U_1) = H(Y|U_2) = 0$: take
$U_1$ (respectively $U_2$) to be a deterministic one-to-one mapping from $\mathcal{X}$ to $\mathcal{U}_1$ (respectively $\mathcal{Y}$ to $\mathcal{U}_2$) (note that $|\mathcal{X}| \le |\mathcal{U}_1|$
and $|\mathcal{Y}| \le |\mathcal{U}_2|$). Also note that $H(f_A(X, Y)|X, U^t)$ and $H(f_B(X, Y)|Y, U^t)$ are continuous functionals of the joint pmf $p_{XYU^t}$,
and for each fixed $p_{XY}$, they are continuous functionals of $p_{U^t|XY}$. Thus, for finite $t$, $\mathcal{P}_{\mathrm{ent},t}(p_{XY}, f_A, f_B)$ is a compact subset
of a finite-dimensional Euclidean space (since it is the contour of conditional pmfs on which the conditional entropies are
equal to zero). Therefore, $\mathcal{P}_t^A(p_{XY}, f_A, f_B) := \mathcal{P}^A_{\mathrm{mc},t} \cap \mathcal{P}_{\mathrm{ent},t}(p_{XY}, f_A, f_B)$ is a compact subset of a finite-dimensional Euclidean
space. Generally speaking, $\mathcal{P}_t^A$ is determined by $p_{XY}$, $f_A$, and $f_B$. In the rest of this paper, however, $f_A$ and $f_B$ are fixed
(but have general form) and $p_{XY}$ is variable. Therefore, we drop $f_A$ and $f_B$ from the notation and speak of the family of
conditional pmfs $\mathcal{P}_t^A(p_{XY})$ associated with $p_{XY}$. For initial terminal $B$, the corresponding set is denoted by $\mathcal{P}_t^B(p_{XY})$. We are
now ready to state the characterization of $R^A_{\mathrm{sum},t}$ developed in [1].

Fact 1: (Characterization of $R^A_{\mathrm{sum},t}$ [1, Corollary 1])

$$
R^A_{\mathrm{sum},t} = \min_{p_{U^t|XY} \in \mathcal{P}_t^A(p_{XY})} \left[ I(X; U^t|Y) + I(Y; U^t|X) \right]. \tag{5}
$$

Note that the conditional mutual information quantities in (5) are continuous functionals of the joint pmf $p_{XYU^t}$. In the
minimization in (5), $p_{XY}$, $f_A$, and $f_B$ are fixed. Since we are minimizing a continuous functional over a compact set, a
minimizer exists in $\mathcal{P}_t^A(p_{XY})$. Since the arguments live in a finite-dimensional Euclidean space, the minimization in (5) is a
finite-dimensional optimization problem.
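To make the objects in Fact 1 concrete, the following Python sketch evaluates the objective $I(X; U^t|Y) + I(Y; U^t|X)$ and the zero-conditional-entropy constraints for one candidate choice of auxiliary variables with $t = 2$ and initial terminal $A$. It does not perform the minimization; it only evaluates a single feasible point, which therefore gives an upper bound on the two-message minimum sum-rate. The function names (`cond_entropy`, `build_joint`, `sum_rate_objective`, `is_feasible`), the dictionary-based pmf representation, and the example (iid Ber(1/2) sources, Boolean AND at both terminals, $U_1 = X$, $U_2 = U_1 \wedge Y$) are illustrative assumptions and are not taken from the paper.

```python
import math

# Coordinate positions inside an outcome tuple (x, y, u1, u2, f_A, f_B).
X, Y, U1, U2, FA, FB = range(6)

def cond_entropy(joint, target, given):
    """H(target | given) in bits for a joint pmf {outcome_tuple: prob}."""
    p_given, p_both = {}, {}
    for w, p in joint.items():
        if p <= 0.0:
            continue
        g = tuple(w[i] for i in given)
        t = tuple(w[i] for i in target)
        p_given[g] = p_given.get(g, 0.0) + p
        p_both[(t, g)] = p_both.get((t, g), 0.0) + p
    return -sum(p * math.log2(p / p_given[g]) for (_, g), p in p_both.items())

def build_joint(p_xy, p_u1_x, p_u2_yu1, f_a, f_b):
    """Joint pmf of (X, Y, U1, U2, f_A, f_B) under the t = 2 factorization
    p_{U1|X}(u1|x) * p_{U2|Y,U1}(u2|y,u1)."""
    joint = {}
    for (x, y), pxy in p_xy.items():
        for u1, a in p_u1_x[x].items():
            for u2, b in p_u2_yu1[(y, u1)].items():
                w = (x, y, u1, u2, f_a(x, y), f_b(x, y))
                joint[w] = joint.get(w, 0.0) + pxy * a * b
    return joint

def sum_rate_objective(joint):
    """I(X;U|Y) + I(Y;U|X), written as differences of conditional entropies."""
    return (cond_entropy(joint, (X,), (Y,))
            - cond_entropy(joint, (X,), (Y, U1, U2))
            + cond_entropy(joint, (Y,), (X,))
            - cond_entropy(joint, (Y,), (X, U1, U2)))

def is_feasible(joint, tol=1e-12):
    """Checks H(f_A|X,U) = H(f_B|Y,U) = 0 up to a numerical tolerance."""
    return (cond_entropy(joint, (FA,), (X, U1, U2)) <= tol
            and cond_entropy(joint, (FB,), (Y, U1, U2)) <= tol)

# Hypothetical example: X, Y iid Ber(1/2), f_A = f_B = AND,
# U1 = X (terminal A sends X), U2 = U1 AND Y (terminal B returns the answer).
p_xy = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
p_u1_x = {x: {x: 1.0} for x in (0, 1)}
p_u2_yu1 = {(y, u1): {y & u1: 1.0} for y in (0, 1) for u1 in (0, 1)}
joint = build_joint(p_xy, p_u1_x, p_u2_yu1,
                    lambda x, y: x & y, lambda x, y: x & y)
print(is_feasible(joint), sum_rate_objective(joint))
```

For this particular (not necessarily optimal) choice, the objective works out to $H(X|Y) + \tfrac{1}{2} H(Y) = 1.5$ bits per sample. A numerical solver for (5) would search over all conditional pmfs of the form (2) rather than evaluate one point.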

The characterization of $R^A_{\mathrm{sum},t}$ in (5) does not directly inform us how quickly $R^A_{\mathrm{sum},t}$ converges to $R_{\mathrm{sum},\infty}$, i.e., bounds on the
rate of convergence are unavailable for general $p_{XY}$, $f_A$, and $f_B$. In the absence of such bounds, one pragmatic approach to
estimate $R_{\mathrm{sum},\infty}$ is to compute $R^A_{\mathrm{sum},t}$ by numerically solving (with some machine precision) the finite-dimensional optimization
problem in (5) for increasing values of $t$ until the difference between $R^A_{\mathrm{sum},t-1}$ and $R^A_{\mathrm{sum},t}$ is smaller than some small number.
Although (5) provides a single-letter characterization for $R^A_{\mathrm{sum},t}$ for each finite $t$, as $t$ increases, an increasing number of
auxiliary random variables $U^t$ are involved in the optimization problem. In fact, due to (1), the upper bounds for $|\mathcal{U}_t|$ increase
exponentially with respect to $t$. Therefore, the dimension of the optimization problem in (5) explodes as $t$ increases. Each
iteration is computationally much more demanding than the previous one. To make matters worse, there appears to be no
obvious way of re-using the computations done for evaluating $R^A_{\mathrm{sum},t-1}$ when evaluating $R^A_{\mathrm{sum},t}$, i.e., every time $t$ is increased,
a new optimization problem needs to be solved all over again. Finally, if we need to estimate $R_{\mathrm{sum},\infty}$ for a different joint
pmf $p_{XY}$ (but for the same functions $f_A$ and $f_B$), we would need to repeat this entire process for the new $p_{XY}$.

In Sec. III, we take a new, fundamentally different approach. We first develop a general convex-geometric blocklength-free
characterization of $R_{\mathrm{sum},\infty}$ which does not involve taking a limit as $t \to \infty$. Furthermore, instead of developing the
characterization of $R_{\mathrm{sum},\infty}$ for a fixed joint pmf $p_{XY}$, which is a single nonnegative real number, we characterize the
entire infinite-message minimum sum-rate surface $R_{\mathrm{sum},\infty}(p_{XY})$, which is a functional of the joint pmf $p_{XY}$, in a single
concise description. This leads to a simple test for checking if a given achievable sum-rate functional of $p_{XY}$ coincides with
$R_{\mathrm{sum},\infty}(p_{XY})$. It also provides a whole new family of lower bounds for $R_{\mathrm{sum},\infty}$. In Sec. IV we use the new characterization
to develop an iterative algorithm for computing the surfaces $R_{\mathrm{sum},\infty}(p_{XY})$ and $R^A_{\mathrm{sum},t}(p_{XY})$ (for any finite $t$) in which, crudely
speaking, the complexity of computation in each iteration does not grow with iteration number and results from the previous
iteration are re-used in the following one. In Sec. V we use the new characterization to evaluate $R_{\mathrm{sum},\infty}$ exactly, in closed
analytic form, for two specific examples. For one of the examples (Sec. V-A), in an earlier work we had derived an upper
bound for $R_{\mathrm{sum},\infty}(p_{XY})$ using an achievable distributed source coding strategy that uses infinitely many infinitesimal-rate
messages, but had been unable to establish the optimality of that strategy. The new characterization, however, shows this to be
optimal. In Sec. VI we show how these results extend to the general rate-distortion problem.

III. Characterization of $R_{\mathrm{sum},\infty}(p_{XY})$

A. The rate reduction functional $\rho_t^A(p_{XY})$

If the goal is to losslessly reproduce the sources ($f_A(x, y) = y$, $f_B(x, y) = x$), the minimum sum-rate is equal to $H(X|Y) +
H(Y|X)$ and this can be achieved by Slepian-Wolf coding. The sum-rate needed for computing functions can only be smaller
than that needed for reproducing sources losslessly. The reduction in the minimum sum-rate for function computation in
comparison to source reproduction is given by

$$
\rho_t^A := H(X|Y) + H(Y|X) - R^A_{\mathrm{sum},t} = \max_{p_{U^t|XY} \in \mathcal{P}_t^A(p_{XY})} \left[ H(X|Y, U^t) + H(Y|X, U^t) \right]. \tag{6}
$$

For interactive distributed source codes with initial terminal $B$, the minimum sum-rate and rate reduction are denoted by
$R^B_{\mathrm{sum},t}$ and $\rho_t^B$ respectively. A quantity which plays a key role in the characterization of $R_{\mathrm{sum},\infty}$ is $\rho_0^A$, corresponding to the
"rate reduction" for zero messages (there are no auxiliary random variables in this case). Since the initial terminal has no
significance when $t = 0$, $\rho_0^A = \rho_0^B =: \rho_0$. Let

$$
\mathcal{P}_{f_A f_B} := \{p_{XY} \in \Delta(\mathcal{X} \times \mathcal{Y}) : H(f_A(X, Y)|X) = H(f_B(X, Y)|Y) = 0\}.
$$

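Error-free computations can be performed without any message transfers if, and only if, $p_{XY} \in \mathcal{P}_{f_A f_B}$. Thus,

$$
R_{\mathrm{sum},0} = \begin{cases} 0, & \text{if } p_{XY} \in \mathcal{P}_{f_A f_B}, \\ +\infty, & \text{otherwise}, \end{cases}
\qquad
\rho_0 = \begin{cases} H(X|Y) + H(Y|X), & \text{if } p_{XY} \in \mathcal{P}_{f_A f_B}, \\ -\infty, & \text{otherwise}. \end{cases} \tag{7}
$$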
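As an illustration of the zero-message rate reduction, the following Python sketch tests whether a joint pmf allows error-free computation without any message transfer (i.e., whether $H(f_A(X,Y)|X) = H(f_B(X,Y)|Y) = 0$) and returns $H(X|Y) + H(Y|X)$ or $-\infty$ accordingly. The helper names (`h_sum`, `zero_message_feasible`, `rho_0`) and the dictionary representation of the pmf are assumptions made for this example only.

```python
import math

def h_sum(p_xy):
    """H(X|Y) + H(Y|X) in bits for a joint pmf {(x, y): prob}."""
    px, py = {}, {}
    for (x, y), p in p_xy.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return -sum(p * (math.log2(p / py[y]) + math.log2(p / px[x]))
                for (x, y), p in p_xy.items() if p > 0.0)

def zero_message_feasible(p_xy, f_a, f_b):
    """True iff, on the support of p_xy, f_A is determined by x alone
    and f_B is determined by y alone."""
    value_given_x, value_given_y = {}, {}
    for (x, y), p in p_xy.items():
        if p <= 0.0:
            continue
        if value_given_x.setdefault(x, f_a(x, y)) != f_a(x, y):
            return False
        if value_given_y.setdefault(y, f_b(x, y)) != f_b(x, y):
            return False
    return True

def rho_0(p_xy, f_a, f_b):
    """Zero-message rate reduction: H(X|Y)+H(Y|X) if feasible, else -inf."""
    if zero_message_feasible(p_xy, f_a, f_b):
        return h_sum(p_xy)
    return -math.inf

def and_fn(x, y):
    return x & y

# Full-support pmf: AND cannot be computed without communication.
p_full = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
# Boundary pmf with X = 0 always: AND is identically 0 on the support.
p_edge = {(0, 0): 0.5, (0, 1): 0.5}
print(rho_0(p_full, and_fn, and_fn), rho_0(p_edge, and_fn, and_fn))
```

In this example the feasible pmf lies on the boundary of the probability simplex, since its support is a strict subset of $\mathcal{X} \times \mathcal{Y}$.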

Remark 1: If $f_A(x, y)$ is not a function of $x$ alone and $f_B(x, y)$ is not a function of $y$ alone, then for all $p_{XY} \in \mathcal{P}_{f_A f_B}$, we
have $\mathrm{supp}(p_{XY}) \ne \mathcal{X} \times \mathcal{Y}$. Such $p_{XY}$ can only lie on the boundary of the probability simplex $\Delta(\mathcal{X} \times \mathcal{Y})$.

Evaluating $R^A_{\mathrm{sum},t}$ is equivalent to evaluating the rate reduction $\rho_t^A$. Notice, however, that in (6), all the auxiliary random
variables appear only as conditioned random variables whereas this is not the case in (5). As discussed in Sec. III-C, this
difference is critical as it enables us to characterize $\rho_\infty := \lim_{t \to \infty} \rho_t^A = \lim_{t \to \infty} \rho_t^B$, which then gives us a characterization of
$R_{\mathrm{sum},\infty}$ as $R_{\mathrm{sum},\infty} = H(X|Y) + H(Y|X) - \rho_\infty$. The rate reduction functional is the key to the characterization.

B. Marginal-perturbations-closed family of joint pmfs $\mathcal{P}_{XY}$

Generally speaking, $R^A_{\mathrm{sum},t}$, $\rho_t^A$, $R_{\mathrm{sum},\infty}$ and $\rho_\infty$ are functionals of $p_{XY}$, $f_A$, and $f_B$. We will view $R^A_{\mathrm{sum},t}(p_{XY})$, $\rho_t^A(p_{XY})$,
$R_{\mathrm{sum},\infty}(p_{XY})$ and $\rho_\infty(p_{XY})$ as functionals of $p_{XY}$ with $f_A$ and $f_B$ fixed to emphasize the dependence on $p_{XY}$. Instead of
evaluating $\rho_\infty(p_{XY})$ for one particular $p_{XY}$, as is done in the numerical evaluation of single-terminal and Wyner-Ziv
rate-distortion functions, our approach is to evaluate $\rho_\infty(p_{XY})$ for all $p_{XY}$ belonging to $\mathcal{P}_{XY}$, a collection of joint pmfs of interest
which is closed in the sense of Definition 4. We will develop a characterization of $\rho_\infty(p_{XY})$ for the entire pmf-collection
$\mathcal{P}_{XY}$, not just for one particular $p_{XY}$. Central to the definition of $\mathcal{P}_{XY}$ is the idea of a marginal perturbation set, which is
discussed next.

Definition 3: ($X$-marginal and $Y$-marginal perturbation sets $\mathcal{P}_{Y|X}(p_{XY})$ and $\mathcal{P}_{X|Y}(p_{XY})$) The set of $X$-marginal perturbations
of a pmf $p_{XY} \in \Delta(\mathcal{X} \times \mathcal{Y})$ is defined as

$$
\mathcal{P}_{Y|X}(p_{XY}) := \{p'_{XY} \in \Delta(\mathcal{X} \times \mathcal{Y}) : p'_{XY} \ll p_{XY},\ p'_{XY}\, p_X = p_{XY}\, p'_X\}
$$

where $p_X$ and $p'_X$ denote the $X$-marginals of $p_{XY}$ and $p'_{XY}$ respectively. Similarly, let

$$
\mathcal{P}_{X|Y}(p_{XY}) := \{p'_{XY} \in \Delta(\mathcal{X} \times \mathcal{Y}) : p'_{XY} \ll p_{XY},\ p'_{XY}\, p_Y = p_{XY}\, p'_Y\}
$$

denote the set of $Y$-marginal perturbations of $p_{XY}$, where $p_Y$ and $p'_Y$ denote the $Y$-marginals of $p_{XY}$ and $p'_{XY}$ respectively.

The sets $\mathcal{P}_{Y|X}(p_{XY})$ and $\mathcal{P}_{X|Y}(p_{XY})$ are nonempty as they contain $p_{XY}$. Notice that a pmf $p'_{XY} \in \mathcal{P}_{Y|X}(p_{XY})$ iff $p'_X \ll p_X$
and $\forall (x, y) \in \mathrm{supp}(p'_X) \times \mathcal{Y}$, $p'_{Y|X}(y|x) = p_{Y|X}(y|x)$, where $p'_X$, $p'_{Y|X}(y|x)$ and $p_X$, $p_{Y|X}(y|x)$ are $X$-marginal and conditional pmfs
of $p'_{XY}$ and $p_{XY}$ respectively. Essentially, $\mathcal{P}_{Y|X}(p_{XY})$ is the collection of all joint pmfs $p'_{XY}$ which have the same conditional
pmf $p_{Y|X}$, i.e., $p'_{XY} = p_{Y|X} \cdot p'_X$ on $\mathrm{supp}(p'_{XY})$. The subtlety is that the conditional pmf $p'_{Y|X}$ of the joint pmf $p'_{XY}$ is well-defined
only on $\mathrm{supp}(p'_X) \times \mathcal{Y}$. Corresponding statements can be made for $\mathcal{P}_{X|Y}(p_{XY})$. Marginal perturbation sets can be viewed as
neighborhoods of $p_{XY}$.
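The membership condition for an $X$-marginal perturbation set can be checked mechanically. The Python sketch below tests whether a candidate joint pmf is absolutely continuous with respect to a reference joint pmf and shares its conditional pmf of $Y$ given $X$, using the cross-multiplied form of the condition so that no division by a possibly zero marginal is needed. It also constructs a perturbation by re-weighting the $X$-marginal. The names `x_marginal`, `in_x_perturbation_set`, and `reweight_x` are hypothetical helpers introduced for illustration.

```python
def x_marginal(p_xy):
    """X-marginal of a joint pmf {(x, y): prob}."""
    px = {}
    for (x, _), p in p_xy.items():
        px[x] = px.get(x, 0.0) + p
    return px

def in_x_perturbation_set(p_new, p_ref, tol=1e-12):
    """True iff p_new << p_ref and p_new(x,y) p_ref(x) = p_ref(x,y) p_new(x)
    for all (x, y), i.e., p_new is an X-marginal perturbation of p_ref."""
    px_ref, px_new = x_marginal(p_ref), x_marginal(p_new)
    for (x, y) in set(p_new) | set(p_ref):
        a = p_new.get((x, y), 0.0)
        b = p_ref.get((x, y), 0.0)
        if a > tol and b <= tol:          # violates absolute continuity
            return False
        if abs(a * px_ref.get(x, 0.0) - b * px_new.get(x, 0.0)) > tol:
            return False                   # conditional pmf of Y given X differs
    return True

def reweight_x(p_ref, new_px):
    """Joint pmf with X-marginal new_px and the conditional pmf of Y given X
    taken from p_ref (new_px must vanish wherever the X-marginal of p_ref does)."""
    px_ref = x_marginal(p_ref)
    return {(x, y): p / px_ref[x] * new_px.get(x, 0.0)
            for (x, y), p in p_ref.items() if px_ref[x] > 0.0}

# Hypothetical example on binary alphabets.
p_ref = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.3, (1, 1): 0.3}
p_pert = reweight_x(p_ref, {0: 0.8, 1: 0.2})      # same p_{Y|X}, new p_X
p_unif = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
print(in_x_perturbation_set(p_pert, p_ref), in_x_perturbation_set(p_unif, p_ref))
```

Here the re-weighted pmf belongs to the perturbation set by construction, while the uniform pmf does not because its conditional pmf of $Y$ given $X = 0$ differs from that of the reference pmf.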



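Remark 2: For all $p_{XY}$: (i) $\mathcal{P}_{Y|X}(p_{XY})$ and $\mathcal{P}_{X|Y}(p_{XY})$ are convex sets of joint pmfs; (ii) if $p'_{XY} \in \mathcal{P}_{Y|X}(p_{XY})$ then $\mathcal{P}_{Y|X}(p'_{XY}) \subseteq
\mathcal{P}_{Y|X}(p_{XY})$; and (iii) if $p'_{XY} \in \mathcal{P}_{X|Y}(p_{XY})$ then $\mathcal{P}_{X|Y}(p'_{XY}) \subseteq \mathcal{P}_{X|Y}(p_{XY})$.

We will develop a characterization of $\rho_\infty(p_{XY})$ for all $p_{XY}$ belonging to any family of joint pmfs $\mathcal{P}_{XY}$ which is closed
with respect to $X$-marginal and $Y$-marginal perturbations.

Definition 4: (Marginal-perturbations-closed family of joint pmfs $\mathcal{P}_{XY}$) A family of joint pmfs $\mathcal{P}_{XY} \subseteq \Delta(\mathcal{X} \times \mathcal{Y})$ will be
called marginal-perturbations-closed if for all $p_{XY} \in \mathcal{P}_{XY}$, $\mathcal{P}_{Y|X}(p_{XY}) \cup \mathcal{P}_{X|Y}(p_{XY}) \subseteq \mathcal{P}_{XY}$.

Examples of such marginal-perturbations-closed families of joint pmfs include (i) the set of all joint pmfs with supports
contained in a specified subset of $\mathcal{X} \times \mathcal{Y}$, i.e., $\mathcal{P}_{XY} = \Delta(S)$ where $S \subseteq \mathcal{X} \times \mathcal{Y}$, and (ii) the set of all joint pmfs of all independent
sources: $\mathcal{P}_{XY} = \{p_X p_Y : p_X \in \Delta(\mathcal{X}),\ p_Y \in \Delta(\mathcal{Y})\}$ (see Sec. V). In fact, if $q_X q_Y$ belongs to any marginal-perturbations-closed
family with $\mathrm{supp}(q_X) = \mathcal{X}$ and $\mathrm{supp}(q_Y) = \mathcal{Y}$, then the family contains $\Delta(\mathcal{X}) \times \Delta(\mathcal{Y})$, that is, all product pmfs on $\mathcal{X} \times \mathcal{Y}$.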

C. Main result

To describe the characterization of the functional $R_{\mathrm{sum},\infty}(p_{XY})$, it is convenient to define the following family of functionals
associated with computing $f_A$ and $f_B$.

Definition 5: (Marginal-perturbations-concave, $\rho_0$-majorizing family of functionals $\mathcal{F}(\mathcal{P}_{XY})$) Let $\mathcal{P}_{XY}$ be any
marginal-perturbations-closed family of joint pmfs in $\Delta(\mathcal{X} \times \mathcal{Y})$. The marginal-perturbations-concave, $\rho_0$-majorizing family of
functionals $\mathcal{F}(\mathcal{P}_{XY})$ is the set of all the functionals $\rho : \mathcal{P}_{XY} \to \mathbb{R}$ satisfying the following three conditions:

1) $\rho_0$-majorization: $\forall p_{XY} \in \mathcal{P}_{XY}$, $\rho(p_{XY}) \ge \rho_0(p_{XY})$.

2) Concavity with respect to $X$-marginal perturbations: $\forall p_{XY} \in \mathcal{P}_{XY}$, $\rho$ is concave on $\mathcal{P}_{Y|X}(p_{XY})$.

3) Concavity with respect to $Y$-marginal perturbations: $\forall p_{XY} \in \mathcal{P}_{XY}$, $\rho$ is concave on $\mathcal{P}_{X|Y}(p_{XY})$.

Remark 3: Since $\rho_0(p_{XY}) = -\infty$ for all $p_{XY} \notin \mathcal{P}_{f_A f_B}$, condition 1) of Definition 5 is trivially satisfied for all $p_{XY} \in
\mathcal{P}_{XY} \setminus \mathcal{P}_{f_A f_B}$ (we use the convention that $\forall a \in \mathbb{R}$, $a > -\infty$). Thus the statement that $\rho$ majorizes $\rho_0$ on the set $\mathcal{P}_{XY}$ is
equivalent to the statement that $\rho$ majorizes $H(X|Y) + H(Y|X)$ on the set $\mathcal{P}_{f_A f_B} \cap \mathcal{P}_{XY}$.

Remark 4: Conditions 2) and 3) do not imply that $\rho$ is concave on $\mathcal{P}_{XY}$. In fact, $\mathcal{P}_{XY}$ itself may not be convex. For
example, the set $\mathcal{P}_{XY} = \{p_X p_Y : p_X \in \Delta(\mathcal{X}),\ p_Y \in \Delta(\mathcal{Y})\}$ is not convex.

We now state and prove the main result of this paper.

Theorem 1: (i) $\rho_\infty \in \mathcal{F}(\mathcal{P}_{XY})$. (ii) For all $\rho \in \mathcal{F}(\mathcal{P}_{XY})$, and all $p_{XY} \in \mathcal{P}_{XY}$, we have $\rho_\infty(p_{XY}) \le \rho(p_{XY})$.

The set $\mathcal{F}(\mathcal{P}_{XY})$ is partially ordered with respect to majorization. The theorem says that $\mathcal{F}(\mathcal{P}_{XY})$ has a least element and
that $\rho_\infty$ is the least element. Note that there is no parameter $t$ which needs to be sent to infinity in this characterization of
$\rho_\infty$.

To prove Theorem 1 we will establish a connection between the $t$-message interactive coding problem and a $(t-1)$-message
interactive coding subproblem. Intuitively, to construct a $t$-message interactive code with initial terminal $A$, we need to begin
by choosing the first message. This corresponds to choosing the auxiliary random variable $U_1$. Then for each realization
$U_1 = u_1$, constructing the remaining part of the code becomes a $(t-1)$-message subproblem with initial terminal $B$ with the
same desired functions, but with a different source pmf $p_{XY|U_1}(\cdot, \cdot|u_1) \in \mathcal{P}_{Y|X}(p_{XY})$. We can repeat this procedure recursively
to construct a $(t-1)$-message interactive code with initial terminal $B$. After $t$ steps of recursion, we will be left with the
trivial 0-message problem.

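Proof: (i) We need to verify that $\rho_\infty$ satisfies all three conditions in Definition 5.

1) Since $\forall p_{XY} \in \mathcal{P}_{XY}$, $R_{\mathrm{sum},\infty}(p_{XY}) \le R_{\mathrm{sum},0}(p_{XY})$, we have $\rho_\infty(p_{XY}) \ge \rho_0(p_{XY})$. Thus $\rho_\infty$ is $\rho_0$-majorizing.

2) For an arbitrary $q_{XY} \in \mathcal{P}_{XY}$, consider two arbitrary joint pmfs $p_{XY,1}, p_{XY,0} \in \mathcal{P}_{Y|X}(q_{XY})$. For every $\lambda \in (0, 1)$, let
$p_{XY,\lambda} := \lambda p_{XY,1} + (1 - \lambda) p_{XY,0}$. Let $p_{X,0}(x)$, $p_{Y|X,0}(y|x)$ and $p_{X,1}(x)$, $p_{Y|X,1}(y|x)$ and $p_{X,\lambda}(x)$, $p_{Y|X,\lambda}(y|x)$ denote the $X$-marginal and
conditional pmfs of $p_{XY,0}$ and $p_{XY,1}$ and $p_{XY,\lambda}$ respectively. Due to Remark 2(i), $p_{XY,\lambda} \in \mathcal{P}_{Y|X}(q_{XY})$. We need to show that
$\rho_\infty(p_{XY,\lambda}) \ge \lambda \rho_\infty(p_{XY,1}) + (1 - \lambda) \rho_\infty(p_{XY,0})$.

Let $(X, Y)$ be a pair of source random variables with joint pmf $p_{XY,\lambda}$. Consider an auxiliary random variable $U_1^*$ taking
values in $\mathcal{U}_1^* := \{0, 1\}$ such that $(X, Y, U_1^*) \sim p_{XY,\lambda}\, p_{U_1^*|X}$ where $\forall x \in \mathrm{supp}(p_{X,\lambda})$, $p_{U_1^*|X}(1|x) := \lambda p_{X,1}(x) / p_{X,\lambda}(x)$ and
$p_{U_1^*|X}(0|x) := (1 - \lambda) p_{X,0}(x) / p_{X,\lambda}(x)$.

It follows that the marginal pmf of $U_1^*$ is $\mathrm{Ber}(\lambda)$ and $Y - X - U_1^*$ is a Markov chain. Consequently, $\forall (x, u_1) \in \mathrm{supp}(p_{X,\lambda}) \times \mathcal{U}_1^*$,
$p_{X|U_1^*}(x|u_1) = p_{X,u_1}(x)$ and $\forall (x, y, u_1) \in \mathrm{supp}(p_{XY,\lambda}) \times \mathcal{U}_1^*$, $p_{Y|X,U_1^*}(y|x, u_1) = p_{Y|X,\lambda}(y|x)$.

The key implication is that $\forall (x, y, u_1) \in \mathrm{supp}(p_{XY,\lambda}) \times \mathcal{U}_1^*$, $p_{XY|U_1^*}(x, y|u_1) = p_{XY,u_1}(x, y)$. This is because $p_{XY|U_1^*}(x, y|u_1) =
p_{X,u_1}(x) \cdot p_{Y|X,\lambda}(y|x) = p_{X,u_1}(x) \cdot p_{Y|X,u_1}(y|x) = p_{XY,u_1}(x, y)$, where in the last but one equality we used the crucial property that
all joint pmfs in $\mathcal{P}_{Y|X}(q_{XY})$ have the same conditional pmf.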
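The binary auxiliary random variable used in this step can be checked numerically. The Python sketch below takes two joint pmfs that share the same conditional pmf of $Y$ given $X$, forms their $\lambda$-mixture, attaches a binary auxiliary variable through $p_{U|X}(1|x) = \lambda p_{X,1}(x)/p_{X,\lambda}(x)$, and then recovers the two original pmfs as the conditional pmfs of $(X, Y)$ given the auxiliary variable. The helper names (`x_marginal`, `reweight_x`, `split_by_binary_aux`) and the numerical values are assumptions chosen for illustration.

```python
def x_marginal(p_xy):
    """X-marginal of a joint pmf {(x, y): prob}."""
    px = {}
    for (x, _), p in p_xy.items():
        px[x] = px.get(x, 0.0) + p
    return px

def reweight_x(q_xy, new_px):
    """Joint pmf with X-marginal new_px and the conditional pmf of Y given X of q_xy."""
    qx = x_marginal(q_xy)
    return {(x, y): p / qx[x] * new_px.get(x, 0.0)
            for (x, y), p in q_xy.items() if qx[x] > 0.0}

def split_by_binary_aux(p1, p0, lam):
    """Builds the joint pmf of (X, Y, U) with (X, Y) ~ lam*p1 + (1-lam)*p0 and
    p_{U|X}(1|x) = lam * p_{X,1}(x) / p_{X,lam}(x); returns
    (P(U = 1), p_{XY|U}(.|1), p_{XY|U}(.|0))."""
    keys = set(p1) | set(p0)
    p_lam = {k: lam * p1.get(k, 0.0) + (1.0 - lam) * p0.get(k, 0.0) for k in keys}
    px1, px0, pxl = x_marginal(p1), x_marginal(p0), x_marginal(p_lam)
    joint = {}
    for (x, y), p in p_lam.items():
        if p <= 0.0:
            continue
        joint[(x, y, 1)] = p * lam * px1.get(x, 0.0) / pxl[x]
        joint[(x, y, 0)] = p * (1.0 - lam) * px0.get(x, 0.0) / pxl[x]
    pu1 = sum(v for (_, _, u), v in joint.items() if u == 1)
    cond1 = {(x, y): v / pu1 for (x, y, u), v in joint.items() if u == 1}
    cond0 = {(x, y): v / (1.0 - pu1) for (x, y, u), v in joint.items() if u == 0}
    return pu1, cond1, cond0

# Hypothetical reference pmf and two X-marginal perturbations of it.
q_xy = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.3, (1, 1): 0.3}
p1 = reweight_x(q_xy, {0: 0.8, 1: 0.2})
p0 = reweight_x(q_xy, {0: 0.1, 1: 0.9})
lam = 0.3
pu1, cond1, cond0 = split_by_binary_aux(p1, p0, lam)
print(pu1, cond1, cond0)
```

Up to floating-point rounding, the probability that the auxiliary variable equals one is $\lambda$, and the two conditional pmfs coincide with the two pmfs that were mixed. This relies on the shared conditional pmf of $Y$ given $X$; for two arbitrary joint pmfs the recovery would generally fail.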

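Now, for all $t \in \mathbb{Z}^+$ we have,

$$
\begin{aligned}
\rho_t^A(p_{XY,\lambda})
&= \max_{p_{U^t|XY} \in \mathcal{P}_t^A(p_{XY,\lambda})} \{H(X|Y, U^t) + H(Y|X, U^t)\} \\
&= \max_{p_{U_1|X}} \Big\{ \max_{\substack{p_{U_2^t|XYU_1} : \\ p_{U_1|X}\, p_{U_2^t|XYU_1} \in \mathcal{P}_t^A(p_{XY,\lambda})}} \{H(X|Y, U^t) + H(Y|X, U^t)\} \Big\} \\
&\overset{(a)}{\ge} \max_{\substack{p_{U_2^t|XYU_1^*} : \\ p_{U_1^*|X}\, p_{U_2^t|XYU_1^*} \in \mathcal{P}_t^A(p_{XY,\lambda})}} \{H(X|Y, U_2^t, U_1^*) + H(Y|X, U_2^t, U_1^*)\} \\
&\overset{(b)}{=} \lambda \cdot \max_{\substack{p_{U_2^t|XYU_1^*}(\cdot|\cdot,\cdot,1) : \\ p_{U_1^*|X}\, p_{U_2^t|XYU_1^*} \in \mathcal{P}_t^A(p_{XY,\lambda})}} \{H(X|Y, U_2^t, U_1^* = 1) + H(Y|X, U_2^t, U_1^* = 1)\} \\
&\quad + (1 - \lambda) \cdot \max_{\substack{p_{U_2^t|XYU_1^*}(\cdot|\cdot,\cdot,0) : \\ p_{U_1^*|X}\, p_{U_2^t|XYU_1^*} \in \mathcal{P}_t^A(p_{XY,\lambda})}} \{H(X|Y, U_2^t, U_1^* = 0) + H(Y|X, U_2^t, U_1^* = 0)\} \\
&\overset{(c)}{=} \lambda\, \rho_{t-1}^B(p_{XY,1}) + (1 - \lambda)\, \rho_{t-1}^B(p_{XY,0}).
\end{aligned} \tag{8}
$$

In step (a) we replaced $p_{U_1|X}$ with the particular $p_{U_1^*|X}$ defined above. Step (b) follows from the "law of total conditional
entropy" with the additional observations that conditioned on $U_1^* = u_1$, $p_{XY|U_1^*}(x, y|u_1) = p_{XY,u_1}(x, y)$ and $(H(X|Y, U_2^t, U_1^* =
u_1) + H(Y|X, U_2^t, U_1^* = u_1))$ only depends on $p_{U_2^t|XYU_1^*}(\cdot|\cdot, \cdot, u_1)$. Step (c) is due to the observation that for a fixed $p_{U_1^*|X}$,
conditioned on $U_1^* = u_1$, (i) $p_{U_1^*|X}\, p_{U_2^t|XYU_1^*} \in \mathcal{P}^A_{\mathrm{mc},t}$ iff $p_{U_2^t|XYU_1^*} \in \mathcal{P}^B_{\mathrm{mc},t-1}$ and (ii) $p_{U_1^*|X}\, p_{U_2^t|XYU_1^*} \in \mathcal{P}_{\mathrm{ent},t}(p_{XY,\lambda}, f_A, f_B)$ iff
$p_{U_2^t|XYU_1^*} \in \mathcal{P}_{\mathrm{ent},t-1}(p_{XY,u_1}, f_A, f_B)$. Therefore, $p_{U_1^*|X}\, p_{U_2^t|XYU_1^*} \in \mathcal{P}_t^A(p_{XY,\lambda})$ iff $p_{U_2^t|XYU_1^*} \in \mathcal{P}_{t-1}^B(p_{XY,u_1})$. Now send $t$ to infinity in
both the left and right sides of (8). Since $\lim_{t \to \infty} \rho_t^A = \lim_{t \to \infty} \rho_t^B = \rho_\infty$, we have $\rho_\infty(p_{XY,\lambda}) \ge \lambda \rho_\infty(p_{XY,1}) + (1 - \lambda) \rho_\infty(p_{XY,0})$.
Therefore, $\rho_\infty$ satisfies condition 2) in Definition 5.

3) In a similar manner, by reversing the roles of terminals $A$ and $B$ in the above proof, it can be shown that $\rho_\infty$ also
satisfies condition 3) in Definition 5. Thus, $\rho_\infty \in \mathcal{F}(\mathcal{P}_{XY})$.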

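(ii) It is sufficient to show that: $\forall \rho \in \mathcal{F}(\mathcal{P}_{XY})$, $\forall p_{XY} \in \mathcal{P}_{XY}$, $\forall t \in \mathbb{Z}^+ \cup \{0\}$, $\rho_t^A(p_{XY}) \le \rho(p_{XY})$ and $\rho_t^B(p_{XY}) \le \rho(p_{XY})$. We
prove this by induction on $t$. For $t = 0$, the result is true by condition 1) in Definition 5: $\rho_0^A(p_{XY}) = \rho_0^B(p_{XY}) = \rho_0(p_{XY}) \le
\rho(p_{XY})$. Now assume that for an arbitrary $t \in \mathbb{Z}^+$, $\rho_{t-1}^A(p_{XY}) \le \rho(p_{XY})$ and $\rho_{t-1}^B(p_{XY}) \le \rho(p_{XY})$ hold. We will show that
$\rho_t^A(p_{XY}) \le \rho(p_{XY})$ and $\rho_t^B(p_{XY}) \le \rho(p_{XY})$ hold.

$$
\begin{aligned}
\rho_t^A(p_{XY})
&= \max_{p_{U^t|XY} \in \mathcal{P}_t^A(p_{XY})} \{H(X|Y, U^t) + H(Y|X, U^t)\} \\
&= \max_{p_{U_1|X}} \Big\{ \max_{\substack{p_{U_2^t|XYU_1} : \\ p_{U_1|X}\, p_{U_2^t|XYU_1} \in \mathcal{P}_t^A(p_{XY})}} \{H(X|Y, U^t) + H(Y|X, U^t)\} \Big\} \\
&\overset{(d)}{=} \max_{p_{U_1|X}} \Big\{ \sum_{u_1 \in \mathrm{supp}(p_{U_1})} p_{U_1}(u_1) \max_{\substack{p_{U_2^t|XYU_1}(\cdot|\cdot,\cdot,u_1) \\ \in\, \mathcal{P}_{t-1}^B(p_{XY|U_1}(\cdot,\cdot|u_1))}} \{H(X|Y, U_2^t, U_1 = u_1) + H(Y|X, U_2^t, U_1 = u_1)\} \Big\} \\
&\overset{(e)}{=} \max_{p_{U_1|X}} \Big\{ \sum_{u_1 \in \mathrm{supp}(p_{U_1})} p_{U_1}(u_1)\, \rho_{t-1}^B(p_{XY|U_1}(\cdot, \cdot|u_1)) \Big\} \\
&\overset{(f)}{\le} \max_{p_{U_1|X}} \Big\{ \sum_{u_1 \in \mathrm{supp}(p_{U_1})} p_{U_1}(u_1)\, \rho(p_{XY|U_1}(\cdot, \cdot|u_1)) \Big\} \\
&\overset{(g)}{\le} \max_{p_{U_1|X}} \Big\{ \rho\Big( \sum_{u_1 \in \mathrm{supp}(p_{U_1})} p_{U_1}(u_1)\, p_{XY|U_1}(\cdot, \cdot|u_1) \Big) \Big\} \\
&= \rho(p_{XY}).
\end{aligned} \tag{9}
$$

The reasoning for steps (d) and (e) is similar to that for steps (b) and (c) respectively in the proof of part (i) (see equation
array (8)), but for step (e) we need to also confirm that $p_{XY|U_1}(\cdot, \cdot|u_1) \in \mathcal{P}_{Y|X}(p_{XY})$ for all $u_1 \in \mathrm{supp}(p_{U_1})$. This is confirmed
by noting that since $Y - X - U_1$ is a Markov chain, $\forall u_1 \in \mathrm{supp}(p_{U_1})$ and $\forall x \in \mathrm{supp}(p_X)$, we have $p_{Y|XU_1}(y|x, u_1) = p_{Y|X}(y|x)$
(see the paragraph after Definition 3). Step (f) is due to the inductive hypothesis $\rho_{t-1}^B(p_{XY}) \le \rho(p_{XY})$. Step (g) is Jensen's inequality
applied to $\rho(p_{XY})$, which is concave on $\mathcal{P}_{Y|X}(p_{XY})$. Using similar steps as above, we can also show that $\rho_t^B(p_{XY}) \le \rho(p_{XY})$.

∎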

Remark 5: In the proof of Theorem [1] there are only two places where the marginal-perturbations-closed property of $\mathcal{P}_{XY}$ is used. It is first used in part (i) to show that $p_{XY|U_1^*}(x,y|u_1) = p_{XY,u_1}(x,y)$. It is used in part (ii) to show that $p_{XY|U_1}(\cdot,\cdot|u_1) \in \mathcal{P}_{Y|X}(p_{XY})$.

Remark 6: It can be verified that the functional $(H(X|Y) + H(Y|X))$ belongs to $\mathcal{F}(\Delta(\mathcal{X} \times \mathcal{Y}))$. Whereas both $(H(X|Y) + H(Y|X))$ and $\rho_\infty(p_{XY})$ are concave on X-marginal and Y-marginal perturbation sets of $p_{XY}$, it cannot be claimed that $R_{sum,\infty}(p_{XY}) = (H(X|Y) + H(Y|X)) - \rho_\infty(p_{XY})$ will be convex on the marginal perturbation sets of $p_{XY}$. For each $t$, $\rho^A_t$ is the maximum of $(H(X|Y,U^t) + H(Y|X,U^t))$, where $U^t$ appear only as conditioning random variables. This enables us to use the "law of total conditional entropy" (which corresponds to convexification) and arrive at (8) and (9). Notice, however, that $R^A_{sum,t}$ is the minimum value of $(I(X;U^t|Y) + I(Y;U^t|X))$ over all $U^t$, where $U^t$ do not appear only as conditioning variables. Therefore, $R^A_{sum,t}$ cannot be expressed as a convex combination of $R^B_{sum,t-1}$. Due to these reasons, although evaluating $\rho_\infty$ is equivalent to evaluating $R_{sum,\infty}$, the rate reduction functional is the key to the characterization as remarked in Sec. [III-A].

Since every $\rho \in \mathcal{F}(\mathcal{P}_{XY})$ gives an upper bound for $\rho_\infty$, $(H(X|Y) + H(Y|X) - \rho)$ gives a lower bound for $R_{sum,\infty}$. This fact provides a way of testing whether an achievable sum-rate functional is optimal. If $R^*$ is a sum-rate functional which is achievable, then $\forall p_{XY} \in \mathcal{P}_{XY}$, $R^*(p_{XY}) \ge R_{sum,\infty}(p_{XY})$. If it can be verified that $\rho^* := (H(X|Y) + H(Y|X) - R^*)$ belongs to $\mathcal{F}(\mathcal{P}_{XY})$, then by Theorem [1], $R^* = R_{sum,\infty}$. The nontrivial part of the test is to verify that $\rho^*$ is concave on X-marginal and Y-marginal perturbation sets. We will demonstrate this test on two examples in Sec. [V].

IV. Iterative algorithm for computing $R^A_{sum,t}(\cdot)$ and $R_{sum,\infty}(\cdot)$

Although Theorem [1] provides a characterization of $\rho_\infty$ and $R_{sum,\infty}$ that is not obtained by taking a limit, it does not directly provide an algorithm to evaluate $R_{sum,\infty}$. Efficiently representing and searching for the least element of $\mathcal{F}(\mathcal{P}_{XY})$ is nontrivial because each element is a functional, not a scalar. The proof of Theorem [1], however, inspires an iterative algorithm for evaluating $R^A_{sum,t}$ and $R_{sum,\infty}$.

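Equation (9) states that $\rho^A_t(p_{XY})$ is the maximum value of $\rho \in \mathbb{R}$ such that $(p_{XY}, \rho)$ is a finite convex combination of $\{(p_{XY|U_1}(\cdot,\cdot|u_1),\ \rho^B_{t-1}(p_{XY|U_1}(\cdot,\cdot|u_1)))\}_{u_1 \in \mathrm{supp}(p_{U_1})}$, where $p_{XY|U_1}(\cdot,\cdot|u_1)$ belongs to $\mathcal{P}_{Y|X}(p_{XY})$ for all $u_1$ in $\mathrm{supp}(p_{U_1}) \subseteq \mathcal{U}_1$. Consider the hypograph of $\rho^B_{t-1}$ on $\mathcal{P}_{Y|X}(p_{XY})$: $\mathrm{hyp}_{\mathcal{P}_{Y|X}(p_{XY})}\,\rho^B_{t-1} := \{(p'_{XY}, \rho) : p'_{XY} \in \mathcal{P}_{Y|X}(p_{XY}),\ \rho \le \rho^B_{t-1}(p'_{XY})\}$. Due to (9), the convex hull of $\mathrm{hyp}_{\mathcal{P}_{Y|X}(p_{XY})}\,\rho^B_{t-1}$ is $\mathrm{hyp}_{\mathcal{P}_{Y|X}(p_{XY})}\,\rho^A_t$. This enables us to evaluate $\rho^A_t$ from $\rho^B_{t-1}$ on the set $\mathcal{P}_{Y|X}(p_{XY})$: $\rho^A_t$ is the least concave functional on $\mathcal{P}_{Y|X}(p_{XY})$ that majorizes $\rho^B_{t-1}$. In the convex optimization literature, $(-\rho^A_t)$ is called the double Legendre-Fenchel transform or convex biconjugate of $(-\rho^B_{t-1})$ [9]. Thus $\rho^A_t$ can be determined through a convex biconjugation operation (taking a convex hull of a hypograph) on any given X-marginal perturbation set. To determine $\rho^A_t(p_{XY})$ for all $p_{XY} \in \mathcal{P}_{XY}$, we can, in principle, first choose a cover for $\mathcal{P}_{XY}$ made up of X-marginal perturbation sets, say $\{\mathcal{P}_{Y|X}(p_{XY})\}_{p_{XY} \in \mathcal{A}}$, where $\mathcal{A} \subseteq \mathcal{P}_{XY}$, and then perform the convex biconjugation operation in every X-marginal perturbation set in the cover. This relationship between $\rho^A_t$ and $\rho^B_{t-1}$ leads to the following iterative algorithm.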

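Algorithm to evaluate $R^A_{sum,t}$ and $R^B_{sum,t}$

• Initialization: Choose a marginal-perturbations-closed family $\mathcal{P}_{XY}$ containing all source joint pmfs of interest. Define $\rho^A_0(p_{XY}) = \rho^B_0(p_{XY}) = \rho_0(p_{XY})$ by equation (7) in the domain $\mathcal{P}_{XY}$. Choose a cover for $\mathcal{P}_{XY}$ made up of X-marginal perturbation sets, denoted by $\{\mathcal{P}_{Y|X}(p_{XY})\}_{p_{XY} \in \mathcal{A}}$, where $\mathcal{A} \subseteq \mathcal{P}_{XY}$. Also choose a cover for $\mathcal{P}_{XY}$ made up of Y-marginal perturbation sets, denoted by $\{\mathcal{P}_{X|Y}(p_{XY})\}_{p_{XY} \in \mathcal{B}}$, where $\mathcal{B} \subseteq \mathcal{P}_{XY}$.

• Loop: For $\tau = 1$ through $t$ do the following.

For every $p_{XY} \in \mathcal{A}$, do the following in the set $\mathcal{P}_{Y|X}(p_{XY})$.

- Construct $\mathrm{hyp}_{\mathcal{P}_{Y|X}(p_{XY})}\,\rho^B_{\tau-1}$.

- Let $\rho^A_\tau$ be the upper boundary of the convex hull of $\mathrm{hyp}_{\mathcal{P}_{Y|X}(p_{XY})}\,\rho^B_{\tau-1}$.

For every $p_{XY} \in \mathcal{B}$, do the following in the set $\mathcal{P}_{X|Y}(p_{XY})$.

- Construct $\mathrm{hyp}_{\mathcal{P}_{X|Y}(p_{XY})}\,\rho^A_{\tau-1}$.

- Let $\rho^B_\tau$ be the upper boundary of the convex hull of $\mathrm{hyp}_{\mathcal{P}_{X|Y}(p_{XY})}\,\rho^A_{\tau-1}$.

• Output: $R^A_{sum,t}(p_{XY}) = H(X|Y) + H(Y|X) - \rho^A_t(p_{XY})$ and $R^B_{sum,t}(p_{XY}) = H(X|Y) + H(Y|X) - \rho^B_t(p_{XY})$.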
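The convex-hull step of the loop can be illustrated in one dimension. The following Python sketch is not taken from the paper: it assumes that a perturbation set has been discretized to a strictly increasing grid `xs`, that `ys` holds the sampled values of the previous functional on that grid, and that `-inf` marks points where the functional equals $-\infty$. The function name `upper_concave_envelope` is hypothetical.

```python
import math


def upper_concave_envelope(xs, ys):
    """Least concave majorant of the finite points (xs[i], ys[i]) on the grid xs.

    xs must be strictly increasing. Grid positions outside the span of the
    finite points keep the value -inf (no hull point has that first coordinate).
    """
    pts = [(x, y) for x, y in zip(xs, ys) if y != -math.inf]
    out = [-math.inf] * len(xs)
    if not pts:
        return out

    # Build the upper hull with a single left-to-right scan.
    hull = []
    for x, y in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop hull[-1] if it lies on or below the chord hull[-2] -> (x, y).
            if (y2 - y1) * (x - x1) <= (y - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append((x, y))

    # Evaluate the hull on the grid by linear interpolation.
    k = 0
    for i, x in enumerate(xs):
        if x < hull[0][0] or x > hull[-1][0]:
            continue
        while k + 1 < len(hull) and hull[k + 1][0] < x:
            k += 1
        if k + 1 == len(hull):  # single-vertex hull
            out[i] = hull[k][1]
        else:
            (x1, y1), (x2, y2) = hull[k], hull[k + 1]
            out[i] = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    return out
```

Calling this routine once per perturbation set in each of the two covers would correspond to one pass of the loop; negating the input and output gives the convex biconjugate of the negated samples on the grid.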

To make numerical computation feasible, $\mathcal{P}_{XY}$ has to be discretized. Once discretized, however, the amount of computation in each iteration is the same and is fixed by the discretization step-size. Also note that results from each iteration are re-used in the following one. Therefore, for large $t$, the complexity of computing $R^A_{sum,t}$ grows linearly with respect to $t$.

$R_{sum,\infty}$ can also be evaluated to any precision, in principle, by running this iterative algorithm for $t = 1, 2, \ldots$, until some stopping criterion is met, e.g., the maximum difference between $\rho^A_{t-1}$ and $\rho^A_t$ on $\mathcal{P}_{XY}$ falls below some threshold. Developing stopping criteria with precision guarantees requires some knowledge of the rate of convergence, which is not established in this paper; the rate may, however, be empirically estimated. For the example presented in Sec. [V-B], the process of convergence and the impact of the discretization step-size on the iterative evaluation are discussed. When the objective is to evaluate $R_{sum,\infty}(p_{XY})$ for all pmfs in $\mathcal{P}_{XY}$, this iterative algorithm is much more efficient than solving the single-letter characterization of $R^A_{sum,t}$ separately for each $p_{XY}$ for $t = 1, 2, \ldots$, an approach which follows the definition of $R_{sum,\infty}$ literally as the limit of $R^A_{sum,t}$ as $t \to \infty$. Our iterative algorithm is based on Theorem [1], which is a characterization of $R_{sum,\infty}$ without taking a limit involving $t$.

Since $-\rho^A_t$ is the convex biconjugate of $-\rho^B_{t-1}$ on all X-marginal perturbation sets and $-\rho^B_t$ is the convex biconjugate of $-\rho^A_{t-1}$ on all Y-marginal perturbation sets, it follows that for all $t > 0$, $\rho^A_t$ satisfies conditions 1) and 2) in Definition [5] ($\rho_0$-majorization and concavity with respect to X-marginal perturbations), and $\rho^B_t$ satisfies conditions 1) and 3) ($\rho_0$-majorization and concavity with respect to Y-marginal perturbations). By Theorem [1], $\rho_\infty$ satisfies all three conditions of Definition [5] and is not larger than any $\rho$ which satisfies all three conditions. Also for all $t$, by definition, $\rho^A_t \le \rho_\infty$ and $\rho^B_t \le \rho_\infty$. Hence, if for some $t$, $\rho^A_t$ satisfies 3) then $\rho^A_t = \rho_\infty$. Similarly, if for some $t$, $\rho^B_t$ satisfies 2) then $\rho^B_t = \rho_\infty$. Thus, $\rho^A_t$ and $\rho^B_t$ equal $\rho_\infty$ iff they satisfy all three conditions. If all three conditions are not satisfied (two are always satisfied), it is beneficial to increase the number of messages. Specifically, if $\rho^A_t$ is not concave on a Y-marginal perturbation set, then for some $p_{XY}$, $\rho^A_t(p_{XY}) < \rho^B_{t+1}(p_{XY}) \le \rho^A_{t+2}(p_{XY})$. Note that if for some $t > 0$, $\rho^A_t = \rho^B_{t-1}$, then this functional must satisfy all three conditions, therefore $\rho^A_t = \rho_\infty$. That is, if it is not beneficial to add one message at the beginning of the communication, it is never beneficial to add arbitrarily many messages.



V. Examples

A. $R_{sum,\infty}$ for independent binary sources and Boolean AND function computed at both terminals

In [1, Sec. IV.F], we studied the samplewise computation of the Boolean AND function at both terminals for independent Bernoulli sources, i.e., $\mathcal{X} = \mathcal{Y} = \{0, 1\}$, $X \perp Y$, $X \sim \mathrm{Ber}(p)$, $Y \sim \mathrm{Ber}(q)$, and $f_A(x,y) = f_B(x,y) = x \wedge y$. An interesting interactive coding scheme was described in [1] where the individual rate for each message vanished as the number of messages went to infinity. The (achievable) infinite-message sum-rate of this scheme, denoted by $R^*$, was evaluated there in closed form; that expression is referred to here as (10).

Expression (10) was derived in [1, Sec. IV.F] for the situation $0 < p \le q < 1$. The situation $0 < q \le p < 1$ follows by symmetry. The remaining situations $pq = 0$ and $(1-p)(1-q) = 0$ easily follow using zero or one message. Since $R^*(p,q)$ is an achievable sum-rate, $R^* \ge R_{sum,\infty}$. Using Theorem [1], we shall now prove that $R^*$ is, in fact, equal to $R_{sum,\infty}$. We will verify that $\rho^* := H(X|Y) + H(Y|X) - R^*$ belongs to $\mathcal{F}(\mathcal{P}_{XY})$ for the product pmf family $\mathcal{P}_{XY}$, which will imply, by Theorem [1](ii), that $\rho^* \ge \rho_\infty$, i.e., $R^* \le R_{sum,\infty}$. Note that $R_{sum,\infty}$ is not evaluated using Theorem [1]. Only part (ii) of Theorem [1] is used as a converse proof to show that the achievable sum-rate $R^*$ is $R_{sum,\infty}$.

Since the sources are independent, we take the marginal-perturbations-closed family to be $\mathcal{P}_{XY} = \{p_X p_Y : p_X \in \Delta(\mathcal{X}),\ p_Y \in \Delta(\mathcal{Y})\}$. For each product pmf $p_X p_Y$, the X-marginal and Y-marginal perturbation sets are $\mathcal{P}_{Y|X}(p_X p_Y) = \{p'_X p_Y : p'_X \ll p_X\}$ and $\mathcal{P}_{X|Y}(p_X p_Y) = \{p_X p'_Y : p'_Y \ll p_Y\}$ respectively. Since $p_X$ and $p_Y$ are parameterized by $p$ and $q$ respectively, each product pmf $p_X p_Y$ can be represented by a point $(p,q) \in [0,1]^2$. For all pmfs $(p,q) \in (0,1)^2$, the X-marginal and Y-marginal perturbation sets are the line segments $[0,1] \times \{q\}$ and $\{p\} \times [0,1]$ respectively. For all pmfs $(0,q)$, where $q \in (0,1)$, the X-marginal and Y-marginal perturbation sets are $\{(0,q)\}$ and $\{0\} \times [0,1]$ respectively. For the pmf $(0,0)$, both the X-marginal and Y-marginal perturbation sets are $\{(0,0)\}$. The marginal perturbation sets of the remaining pmfs on the boundary of $[0,1]^2$ can be derived using symmetry (swap $p$ and $q$; then swap symbols 0 and 1).

It is easy to see that

$$
\rho_0(p,q) = \begin{cases} h_2(p) + h_2(q), & \text{if } (p,q) \in \mathcal{P}_{f_A f_B}, \\ -\infty, & \text{otherwise,} \end{cases}
$$

where $\mathcal{P}_{f_A f_B} = \{(p,q) : p = 0 \text{ or } q = 0 \text{ or } p = q = 1\}$. It is also easy to verify that for all $(p,q) \in \mathcal{P}_{f_A f_B}$, $R^*(p,q) \le R_{sum,0}(p,q) = 0$, or equivalently, $\rho^*(p,q) \ge \rho_0(p,q)$ (the latter inequality holds trivially elsewhere, where $\rho_0 = -\infty$). By taking the first and second-order partial derivatives of $\rho^*(p,q) = h_2(p) + h_2(q) - R^*(p,q)$ with respect to $p$ and $q$, we can verify that for any fixed $q$, $\rho^*(p,q)$ is concave with respect to $p$, and for any fixed $p$, $\rho^*(p,q)$ is concave with respect to $q$. Therefore, $\rho^*(p,q)$ is concave in every X-marginal and Y-marginal perturbation set. Therefore, $\rho^* \in \mathcal{F}(\mathcal{P}_{XY})$, which implies that $R_{sum,\infty}(p,q) \ge R^*(p,q)$ due to Theorem [1](ii). Since $R^*(p,q)$ is both an upper bound and a lower bound of $R_{sum,\infty}(p,q)$, we have $R_{sum,\infty}(p,q) = R^*(p,q)$.

B. $R_{sum,\infty}$ for independent binary sources and Boolean AND function computed at only terminal B

We change the problem in Sec. [V-A] to the problem of computing the Boolean AND function at only terminal B, i.e., $f_A(x,y) = 0$ and $f_B(x,y) = x \wedge y$. The source statistics are unchanged: $X \perp Y$, $X \sim \mathrm{Ber}(p)$, $Y \sim \mathrm{Ber}(q)$. An achievable sum-rate $R^*$ presented below can be derived using the same technique presented in [1, Sec. IV.F].

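$$
R^*(p,q) = \begin{cases}
h_2(p) + p\,h_2(q) + p \log_2 q + p(1-2q)\log_2 e, & \text{if } 0 \le p \le q \le 1/2, \\
R^*(q,p), & \text{if } 0 \le q \le p \le 1/2, \\
R^*(1-p,q), & \text{if } 0 \le q \le 1/2 \le p \le 1, \\
h_2(p), & \text{if } 1/2 \le q \le 1.
\end{cases} \qquad (11)
$$

The proof of the achievability of $R^*$ is given in Appendix I. Following the method in Sec. [V-A], it can be verified that $\mathcal{P}_{f_A f_B} = \{(p,q) : p = 0 \text{ or } q = 0 \text{ or } p = 1\}$ and $\rho^*(p,q) = (h_2(p) + h_2(q) - R^*(p,q))$ belongs to $\mathcal{F}(\mathcal{P}_{XY})$, where $\mathcal{P}_{XY} = \{p_X p_Y : p_X \in \Delta(\mathcal{X}),\ p_Y \in \Delta(\mathcal{Y})\}$ is the same marginal-perturbations-closed family used in Sec. [V-A]. Therefore, $R^* = R_{sum,\infty}$.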
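The concavity requirement in this test can be sanity-checked numerically. The Python sketch below is an illustration rather than part of the paper: it evaluates the closed-form $R^*$ for the terminal-B problem given above, forms $\rho^* = h_2(p) + h_2(q) - R^*$, and computes the largest second difference of $\rho^*$ along $p$ and along $q$ on a uniform grid. The function names and the grid size are assumptions, and a non-positive result on a grid is only supporting evidence, not a proof of concavity.

```python
import math


def h2(x):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def r_star(p, q):
    """Closed-form R*(p, q) for the AND function computed only at terminal B."""
    if q >= 0.5:
        return h2(p)                 # fourth case
    if p > 0.5:
        return r_star(1.0 - p, q)    # third case
    if p > q:
        p, q = q, p                  # second case: R*(p, q) = R*(q, p)
    if p <= 0.0:
        return 0.0                   # limit of the first case as p -> 0
    return (h2(p) + p * h2(q) + p * math.log2(q)
            + p * (1.0 - 2.0 * q) * math.log2(math.e))


def rho_star(p, q):
    """Rate reduction associated with R*."""
    return h2(p) + h2(q) - r_star(p, q)


def max_second_difference(f, n=50):
    """Largest central second difference of f along p and along q on an n-step grid."""
    step = 1.0 / n
    worst = -math.inf
    for i in range(1, n):
        for j in range(1, n):
            p, q = i * step, j * step
            d_p = f(p - step, q) + f(p + step, q) - 2.0 * f(p, q)
            d_q = f(p, q - step) + f(p, q + step) - 2.0 * f(p, q)
            worst = max(worst, d_p, d_q)
    return worst
```

A value of `max_second_difference(rho_star)` that is non-positive up to floating-point rounding is consistent with concavity on every horizontal and vertical line of the unit square, which are the marginal perturbation sets for interior points.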

Iterative algorithm: Now we use this example to demonstrate the numerical implementation of the iterative algorithm discussed in Sec. [IV].

• Initialization: Choose $\mathcal{P}_{XY} = [0,1]^2$. Choose $\mathcal{A} = \{(1/2, q)\}_{q \in [0,1]}$, which leads to a cover for $\mathcal{P}_{XY}$ made up of X-marginal perturbation sets $\{[0,1] \times \{q\}\}_{q \in [0,1]}$. Similarly, choose $\mathcal{B} = \{(p, 1/2)\}_{p \in [0,1]}$, which leads to a cover made up of Y-marginal perturbation sets $\{\{p\} \times [0,1]\}_{p \in [0,1]}$. Discretize $\mathcal{P}_{XY}$ into the $N \times N$ grid $\mathcal{P}^*_{XY} := \{(\tfrac{i}{N-1}, \tfrac{j}{N-1})\}_{i,j=0}^{N-1}$. The two covers are correspondingly discretized into the collection of the columns and the collection of the rows of $\mathcal{P}^*_{XY}$. Compute $\rho^A_0(p,q) = \rho^B_0(p,q) = \rho_0(p,q)$ according to (7) for all $(p,q) \in \mathcal{P}^*_{XY}$ as follows.

$$
\rho_0(p,q) = \begin{cases} h(p) + h(q), & \text{if } p = 0 \text{ or } q = 0 \text{ or } p = 1, \\ -\infty, & \text{otherwise.} \end{cases}
$$

• Loop: For $\tau = 1$ through $t$ do the following.

- For every $j \in \{0, \ldots, N-1\}$, do the following. Let $\mathcal{H}_j(\rho^B_{\tau-1})$ be the set of points $(\tfrac{i}{N-1},\ \rho^B_{\tau-1}(\tfrac{i}{N-1}, \tfrac{j}{N-1}))$ for all $i \in \{0, \ldots, N-1\}$ such that $\rho^B_{\tau-1}(\tfrac{i}{N-1}, \tfrac{j}{N-1}) \ne -\infty$. For $i \in \{0, \ldots, N-1\}$, define $\rho^A_\tau(\tfrac{i}{N-1}, \tfrac{j}{N-1})$ in such a way that $(\tfrac{i}{N-1},\ \rho^A_\tau(\tfrac{i}{N-1}, \tfrac{j}{N-1}))$ is on the upper boundary of the convex hull of $\mathcal{H}_j(\rho^B_{\tau-1})$. If for some $i$, there is no point in the convex hull with the first coordinate $\tfrac{i}{N-1}$, then set $\rho^A_\tau(\tfrac{i}{N-1}, \tfrac{j}{N-1}) = -\infty$. With this definition, $-\rho^A_\tau$ is the convex biconjugate of $-\rho^B_{\tau-1}$ on the X-marginal perturbation set, taking the symbol $-\infty$ into consideration.

- For every $i \in \{0, \ldots, N-1\}$, do the following. Let $\mathcal{H}_i(\rho^A_{\tau-1})$ be the set of points $(\tfrac{j}{N-1},\ \rho^A_{\tau-1}(\tfrac{i}{N-1}, \tfrac{j}{N-1}))$ for all $j \in \{0, \ldots, N-1\}$ such that $\rho^A_{\tau-1}(\tfrac{i}{N-1}, \tfrac{j}{N-1}) \ne -\infty$. For $j \in \{0, \ldots, N-1\}$, define $\rho^B_\tau(\tfrac{i}{N-1}, \tfrac{j}{N-1})$ in such a way that $(\tfrac{j}{N-1},\ \rho^B_\tau(\tfrac{i}{N-1}, \tfrac{j}{N-1}))$ is on the upper boundary of the convex hull of $\mathcal{H}_i(\rho^A_{\tau-1})$. If for some $j$, there is no point in the convex hull with the first coordinate $\tfrac{j}{N-1}$, then set $\rho^B_\tau(\tfrac{i}{N-1}, \tfrac{j}{N-1}) = -\infty$.

• Output: $R^{A/B}_{sum,t}(p,q) = h(p) + h(q) - \rho^{A/B}_t(p,q)$ for all $(p,q)$ on the grid.
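The grid procedure above can be sketched in a few lines of Python. This is an illustrative reading of the steps, not the authors' code: the names `concave_envelope`, `rate_reduction` and `sum_rate` are hypothetical, the grid is stored as nested lists with `rho[i][j]` holding the value at $(i/(N-1), j/(N-1))$, and `-inf` encodes the symbol $-\infty$.

```python
import math

NEG_INF = -math.inf


def h2(x):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def concave_envelope(xs, ys):
    """Upper boundary of the convex hull of the finite points (xs[i], ys[i])."""
    out = [NEG_INF] * len(xs)
    hull = []
    for x, y in zip(xs, ys):
        if y == NEG_INF:
            continue
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (y2 - y1) * (x - x1) <= (y - y1) * (x2 - x1):
                hull.pop()          # hull[-1] is on or below the new chord
            else:
                break
        hull.append((x, y))
    k = 0
    for i, x in enumerate(xs):
        if not hull or x < hull[0][0] or x > hull[-1][0]:
            continue                # no hull point with this first coordinate
        while k + 1 < len(hull) and hull[k + 1][0] < x:
            k += 1
        if k + 1 == len(hull):
            out[i] = hull[k][1]
        else:
            (x1, y1), (x2, y2) = hull[k], hull[k + 1]
            out[i] = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    return out


def rho0(p, q):
    """rho_0 for AND computed only at B: finite iff p = 0, q = 0 or p = 1."""
    return h2(p) + h2(q) if (p == 0.0 or q == 0.0 or p == 1.0) else NEG_INF


def rate_reduction(n_grid, t):
    """Return (grid, rho_A_t, rho_B_t) after t iterations on an n_grid x n_grid grid."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    rho_a = [[rho0(p, q) for q in grid] for p in grid]
    rho_b = [row[:] for row in rho_a]
    for _ in range(t):
        # rho_A: envelope of the previous rho_B along p, for each fixed q.
        cols = [concave_envelope(grid, [rho_b[i][j] for i in range(n_grid)])
                for j in range(n_grid)]
        new_a = [[cols[j][i] for j in range(n_grid)] for i in range(n_grid)]
        # rho_B: envelope of the previous rho_A along q, for each fixed p.
        new_b = [concave_envelope(grid, rho_a[i]) for i in range(n_grid)]
        rho_a, rho_b = new_a, new_b
    return grid, rho_a, rho_b


def sum_rate(p, q, rho):
    """Sum-rate corresponding to a rate-reduction value rho at (p, q)."""
    return h2(p) + h2(q) - rho
```

For instance, `rate_reduction(21, 2)` returns the two-message rate reductions on a grid with step 0.05; increasing `t` and `n_grid` trades running time for accuracy, and each iteration costs the same amount of work for a fixed grid.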

Fig. [2] shows some plots of the rate reduction functionals for different $t$. Since for $t > 2$, $\rho_t$ and $\rho_\infty$ are hardly distinguishable, we use brightness to show $(\rho_\infty - \rho_t) = (R_{sum,t} - R_{sum,\infty})$ in Fig. [3] to highlight the difference. Depending on the specific joint pmf, the limit $R_{sum,\infty}$ may be reached by $R_{sum,t}$ with a finite $t$. Specifically, for all $(p,q) \in \mathcal{P}_{f_A f_B}$, $R_{sum,0} = 0 = R_{sum,\infty}$ and no message needs to be sent. For all $(p,q) \in (0,1) \times [1/2, 1]$, $R_{sum,\infty} = h_2(p)$ and this sum-rate can be achieved with $t = 1$ message from A to B, thus $R^A_{sum,1} = R_{sum,\infty}$. Note that $R_{sum,0} = \infty$ because $(p,q) \notin \mathcal{P}_{f_A f_B}$, and $R^B_{sum,1} = \infty$. For $(p,q) \in \{1/2\} \times (0, 1/2)$, $R_{sum,\infty} = h_2(q)$. In [8, Sec. V.C] it was shown that this sum-rate can be achieved with $t = 2$ messages, the first from B to A and the second from A to B. Thus $R^B_{sum,2} = R_{sum,\infty}$. Note that $R^B_{sum,1} = \infty$ and in [1, Sec. IV.C] we showed that $R^A_{sum,2} = \log_2 2 = 1$. For the distributions $(p,q)$ discussed above, $R_{sum,\infty}$ can be reached by $t = 0$, 1 or 2. However, we can see from Fig. [3] that when $(p,q)$ is close to the line segments $p = q < 1/2$ and $1 - p = q < 1/2$, we do need a large $t$ to get an $R_{sum,t}$ close to $R_{sum,\infty}$.

When the iterative algorithm is numerically implemented to approximate $\rho_\infty$, the accuracy depends on the number of iterations and the discretization step-size. Fig. 4(a) shows the dependency of the maximum error $\max_{(p,q)} (\rho_\infty(p,q) - \rho_t(p,q))$ with respect to $t$ and $N$. For each $N$, the maximum error decreases as $t$ increases until an error floor is reached. For a finer discretization with a larger $N$, the error floor is lower. Fig. 4(b) shows the relation between the error floor level and the computation time needed to reach the error floor for different $N$. Roughly speaking, when $N$ is doubled, the error floor level is halved and the computation time to reach the error floor is approximately multiplied by four.

[Figure 3: panels t = 5, t = 6, t = 7]

Fig. 3. Difference between $\rho_\infty$ and $\rho_t$ for $x \wedge y$ computed only at terminal B (Sec. [V-B]). The brightness represents the scaled logarithm of $(\rho_\infty - \rho_t)$. The white color means a large $(\rho_\infty - \rho_t)$ and the black color means $(\rho_\infty - \rho_t) < 10^{-4}$.




VI. Extension to interactive rate-distortion problem 

A. Problem formulation 

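In [2] we studied the interactive coding problem with per-sample distortion criteria. Let $d_A : \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}_A \to \mathcal{D}$ and $d_B : \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}_B \to \mathcal{D}$ be bounded single-letter distortion functions, where $\mathcal{D} := [0, d_{max}]$. The fidelity of function computation can be measured by the per-sample average distortion

$$
d_A^{(n)}(\mathbf{x},\mathbf{y},\mathbf{z}_A) := \frac{1}{n} \sum_{i=1}^{n} d_A(x(i), y(i), z_A(i)), \qquad d_B^{(n)}(\mathbf{x},\mathbf{y},\mathbf{z}_B) := \frac{1}{n} \sum_{i=1}^{n} d_B(x(i), y(i), z_B(i)).
$$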
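As a small illustration of the per-sample average distortion, the Python sketch below averages a single-letter distortion over a block. The helper names are hypothetical, and the Hamming distortion with respect to the Boolean AND is used only as an example of a bounded single-letter distortion function.

```python
def per_sample_distortion(d, x, y, z):
    """Average of d(x(i), y(i), z(i)) over the n samples of a block."""
    n = len(x)
    if n == 0 or not (n == len(y) == len(z)):
        raise ValueError("x, y and z must be non-empty and of equal length")
    return sum(d(xi, yi, zi) for xi, yi, zi in zip(x, y, z)) / n


def hamming_and_distortion(x, y, z):
    """Hamming distortion between z and the Boolean AND of the bits x and y."""
    return 0.0 if z == (x & y) else 1.0
```

With this choice of distortion, a zero average distortion means that every sample of the Boolean AND is reproduced correctly.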

Of interest here are the expected per-sample distortions $E[d_A^{(n)}(\mathbf{X},\mathbf{Y},\mathbf{Z}_A)]$ and $E[d_B^{(n)}(\mathbf{X},\mathbf{Y},\mathbf{Z}_B)]$. Note that although the desired functions $f_A$ and $f_B$ do not explicitly appear in these fidelity criteria, they are subsumed by $d_A$ and $d_B$ because they accommodate general relationships between the sources and the outputs of the decoding functions. The performance of $t$-message interactive coding for function computation is measured as follows.

Definition 6: A rate-distortion tuple $(\mathbf{R}, \mathbf{D}) = (R_1, \ldots, R_t, D_A, D_B)$ is admissible for $t$-message interactive function computation with initial terminal A if, $\forall \epsilon > 0$, $\exists\, N(\epsilon, t)$ such that $\forall n > N(\epsilon, t)$, there exists an interactive distributed source code with initial terminal A and parameters $(t, n, |\mathcal{M}_1|, \ldots, |\mathcal{M}_t|)$ satisfying

$$
\frac{1}{n} \log_2 |\mathcal{M}_j| \le R_j + \epsilon, \quad j = 1, \ldots, t,
$$

$$
E[d_A^{(n)}(\mathbf{X},\mathbf{Y},\mathbf{Z}_A)] \le D_A + \epsilon, \qquad E[d_B^{(n)}(\mathbf{X},\mathbf{Y},\mathbf{Z}_B)] \le D_B + \epsilon.
$$

The set of all admissible rate-distortion tuples, denoted by $\mathcal{RD}^A_t$, is called the operational rate-distortion region for $t$-message interactive function computation with initial terminal A. The rate-distortion region is closed and convex due to the way it has been defined. The sum-rate-distortion function $R^A_{sum,t}(\mathbf{D})$ is given by $\min (\sum_{j=1}^{t} R_j)$ where the minimization is over all $\mathbf{R}$ such that $(\mathbf{R}, \mathbf{D}) \in \mathcal{RD}^A_t$. For initial terminal B, the rate-distortion region and the minimum sum-rate-distortion function are denoted by $\mathcal{RD}^B_t$ and $R^B_{sum,t}(\mathbf{D})$ respectively. For any fixed $\mathbf{D}$, we define $R_{sum,\infty}(\mathbf{D}) := \lim_{t\to\infty} R^A_{sum,t}(\mathbf{D}) = \lim_{t\to\infty} R^B_{sum,t}(\mathbf{D})$.

The admissibility of a rate-distortion tuple can also be defined in terms of the probability of excess distortion by replacing the expected distortion conditions in Definition [6] by the conditions $P(d_A^{(n)}(\mathbf{X},\mathbf{Y},\mathbf{Z}_A) > D_A) \le \epsilon$ and $P(d_B^{(n)}(\mathbf{X},\mathbf{Y},\mathbf{Z}_B) > D_B) \le \epsilon$. Although these conditions appear to be more stringent¹, it can be shown² that they lead to the same operational rate-distortion region. For simplicity, we focus on the expected distortion conditions as in Definition [6].

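B. Characterization of $R^A_{sum,t}(p_{XY},\mathbf{D})$ and $\rho^A_t(p_{XY},\mathbf{D})$ for finite $t$ [2]

The single-letter characterization of $R^A_{sum,t}(p_{XY},\mathbf{D})$ is given by

$$
R^A_{sum,t}(p_{XY},\mathbf{D}) = \min_{(p_{U^t|XY},\, g_A,\, g_B) \in \mathcal{P}^A_t(p_{XY},\mathbf{D})} [I(X;U^t|Y) + I(Y;U^t|X)], \qquad (12)
$$

where $\mathcal{P}^A_t(p_{XY},\mathbf{D}) := \{(p_{U^t|XY}, g_A, g_B) : p_{U^t|XY} \in \mathcal{P}^A_{mc,t}$, and $g_A$, $g_B$ are deterministic functions satisfying $E[d_A(X, Y, g_A(U^t, X))] \le D_A$ and $E[d_B(X, Y, g_B(U^t, Y))] \le D_B\}$. Compared with its counterpart in the function-computation problem, the expected distortion constraints replace the conditional entropy constraints. The rate reduction functional is defined as follows.

$$
\rho^A_t(p_{XY},\mathbf{D}) := H(X|Y) + H(Y|X) - R^A_{sum,t}(p_{XY},\mathbf{D}) = \max_{(p_{U^t|XY},\, g_A,\, g_B) \in \mathcal{P}^A_t(p_{XY},\mathbf{D})} [H(X|Y,U^t) + H(Y|X,U^t)].
$$

For $t = 0$, let $\mathcal{P}_{f_A f_B \mathbf{D}} := \{(p_{XY}, \mathbf{D}) : \exists\, g_A, g_B \text{ s.t. } E[d_A(X, Y, g_A(X))] \le D_A,\ E[d_B(X, Y, g_B(Y))] \le D_B\}$. Then we have

$$
R_{sum,0}(p_{XY},\mathbf{D}) = \begin{cases} 0, & \text{if } (p_{XY},\mathbf{D}) \in \mathcal{P}_{f_A f_B \mathbf{D}}, \\ +\infty, & \text{otherwise,} \end{cases}
$$

$$
\rho_0(p_{XY},\mathbf{D}) = \begin{cases} H(X|Y) + H(Y|X), & \text{if } (p_{XY},\mathbf{D}) \in \mathcal{P}_{f_A f_B \mathbf{D}}, \\ -\infty, & \text{otherwise.} \end{cases} \qquad (13)
$$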

C. Characterization of $R_{sum,\infty}(p_{XY},\mathbf{D})$

We can use the same technique as in Sec. [III] to characterize the functional $\rho_\infty(p_{XY},\mathbf{D})$.

Definition 7: (Marginal-perturbations-distortion-concave, $\rho_0$-majorizing family of functionals $\mathcal{F}_D(\mathcal{P}_{XY})$) Let $\mathcal{P}_{XY}$ be any marginal-perturbations-closed family of joint pmfs on $\Delta(\mathcal{X} \times \mathcal{Y})$. The marginal-perturbations-distortion-concave, $\rho_0$-majorizing family of functionals $\mathcal{F}_D(\mathcal{P}_{XY})$ is the set of all the functionals $\rho : \mathcal{P}_{XY} \times \mathcal{D}^2 \to \mathbb{R}$ satisfying the following conditions:

1) $\rho_0$-majorization: $\forall p_{XY} \in \mathcal{P}_{XY}$ and $\forall \mathbf{D} \in \mathcal{D}^2$, $\rho(p_{XY},\mathbf{D}) \ge \rho_0(p_{XY},\mathbf{D})$.

2) Concavity with respect to X-marginal perturbations and distortion vector: $\forall p_{XY} \in \mathcal{P}_{XY}$, $\rho$ is concave on $\mathcal{P}_{Y|X}(p_{XY}) \times \mathcal{D}^2$.

3) Concavity with respect to Y-marginal perturbations and distortion vector: $\forall p_{XY} \in \mathcal{P}_{XY}$, $\rho$ is concave on $\mathcal{P}_{X|Y}(p_{XY}) \times \mathcal{D}^2$.

The following characterization of $\rho_\infty(p_{XY},\mathbf{D})$ is the generalization of Theorem [1] to the rate-distortion problem.

Theorem 2: (i) $\rho_\infty \in \mathcal{F}_D(\mathcal{P}_{XY})$. (ii) For all $\rho \in \mathcal{F}_D(\mathcal{P}_{XY})$ and $\forall (p_{XY},\mathbf{D}) \in \mathcal{P}_{XY} \times \mathcal{D}^2$, we have $\rho_\infty(p_{XY},\mathbf{D}) \le \rho(p_{XY},\mathbf{D})$.

The proof of Theorem [2] is parallel to that of Theorem [1].

Proof: (i) We need to verify that $\rho_\infty$ satisfies all three conditions in Definition [7].

1) Since $R_{sum,\infty}(p_{XY},\mathbf{D}) \le R_{sum,0}(p_{XY},\mathbf{D})$, we have $\rho_\infty(p_{XY},\mathbf{D}) \ge \rho_0(p_{XY},\mathbf{D})$.

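2) For an arbitrary $q_{XY} \in \mathcal{P}_{XY}$, consider two tuples $(p_{XY,1}, \mathbf{D}_1), (p_{XY,0}, \mathbf{D}_0) \in \mathcal{P}_{Y|X}(q_{XY}) \times \mathcal{D}^2$. For every $\lambda \in (0,1)$, let $(p_{XY,\lambda}, \mathbf{D}_\lambda) := \lambda (p_{XY,1}, \mathbf{D}_1) + (1-\lambda)(p_{XY,0}, \mathbf{D}_0)$. We need to show that $\rho_\infty(p_{XY,\lambda}, \mathbf{D}_\lambda) \ge \lambda\, \rho_\infty(p_{XY,1}, \mathbf{D}_1) + (1-\lambda)\, \rho_\infty(p_{XY,0}, \mathbf{D}_0)$. Using the same method as in the proof of Theorem [1] part (i.2), we construct the joint pmf $p_{XYU_1^*}$ with the following properties: (P1) $U_1^*$ is $\mathrm{Ber}(\lambda)$, (P2) $\forall (x,y,u_1) \in \mathrm{supp}(p_{XY,\lambda}) \times \mathcal{U}_1^*$, $p_{XY|U_1^*}(x,y|u_1) = p_{XY,u_1}(x,y)$, and (P3) $Y - X - U_1^*$ is a Markov chain. For every $t \in \mathbb{Z}^+$, we have

$$
\begin{aligned}
\rho^A_t(p_{XY,\lambda}, \mathbf{D}_\lambda) &= \max_{(p_{U^t|XY},\, g_A,\, g_B) \in \mathcal{P}^A_t(p_{XY,\lambda}, \mathbf{D}_\lambda)} [H(X|Y,U^t) + H(Y|X,U^t)] \\
&= \max_{p_{U_1|X}} \Big\{ \max_{(p_{U_2^t|XYU_1},\, g_A,\, g_B):\ (p_{U_1|X}\,p_{U_2^t|XYU_1},\, g_A,\, g_B) \in \mathcal{P}^A_t(p_{XY,\lambda}, \mathbf{D}_\lambda)} [H(X|Y,U^t) + H(Y|X,U^t)] \Big\} \\
&\overset{(a)}{\ge} \max_{\substack{(p_{U_1^*|X}(u_1|\cdot)\, p_{U_2^t|XYU_1^*}(\cdot|\cdot,\cdot,u_1),\, g_A,\, g_B) \in \mathcal{P}^A_t(p_{XY,u_1}, \mathbf{D}_{u_1}), \\ u_1 \in \{0,1\}}} \{H(X|Y,U_2^t,U_1^*) + H(Y|X,U_2^t,U_1^*)\} \\
&\overset{(b)}{=} \lambda \max_{(p_{U_1^*|X}(1|\cdot)\, p_{U_2^t|XYU_1^*}(\cdot|\cdot,\cdot,1),\, g_A,\, g_B) \in \mathcal{P}^A_t(p_{XY,1}, \mathbf{D}_1)} \{H(X|Y,U_2^t,U_1^*=1) + H(Y|X,U_2^t,U_1^*=1)\} \\
&\quad + (1-\lambda) \max_{(p_{U_1^*|X}(0|\cdot)\, p_{U_2^t|XYU_1^*}(\cdot|\cdot,\cdot,0),\, g_A,\, g_B) \in \mathcal{P}^A_t(p_{XY,0}, \mathbf{D}_0)} \{H(X|Y,U_2^t,U_1^*=0) + H(Y|X,U_2^t,U_1^*=0)\} \\
&\overset{(c)}{=} \lambda\, \rho^B_{t-1}(p_{XY,1}, \mathbf{D}_1) + (1-\lambda)\, \rho^B_{t-1}(p_{XY,0}, \mathbf{D}_0). \qquad (14)
\end{aligned}
$$

In step (a) we replaced $p_{U_1|X}$ with the particular $p_{U_1^*|X}$ defined above, and replaced the overall distortion constraints $E[d_A(X, Y, g_A(U^t, X))] \le D_{A,\lambda}$ and $E[d_B(X, Y, g_B(U^t, Y))] \le D_{B,\lambda}$ by the stronger individual distortion constraints $E[d_A(X, Y, g_A(U_1^*, U_2^t, X)) \mid U_1^* = u_1] \le D_{A,u_1}$ and $E[d_B(X, Y, g_B(U_1^*, U_2^t, Y)) \mid U_1^* = u_1] \le D_{B,u_1}$ for $u_1 = 0$ or 1. Step (b) follows from the "law of total conditional entropy" with the additional observation that conditioned on $U_1^* = u_1$, $(H(X|Y,U_2^t,U_1^*=u_1) + H(Y|X,U_2^t,U_1^*=u_1))$ only depends on $p_{U_2^t|XYU_1^*}(\cdot|\cdot,\cdot,u_1)$, $g_A(u_1, \ldots)$, and $g_B(u_1, \ldots)$. Step (c) is due to the observation that for a fixed $p_{U_1^*|X}$, conditioned on $U_1^* = u_1$, $(p_{U_1^*|X}\, p_{U_2^t|XYU_1^*}, g_A, g_B) \in \mathcal{P}^A_t(p_{XY,u_1}, \mathbf{D}_{u_1})$ iff $(p_{U_2^t|XYU_1^*}, g_A, g_B) \in \mathcal{P}^B_{t-1}(p_{XY,u_1}, \mathbf{D}_{u_1})$. Now send $t$ to infinity in both the left and right sides of (14). We have $\rho_\infty(p_{XY,\lambda}, \mathbf{D}_\lambda) \ge \lambda\, \rho_\infty(p_{XY,1}, \mathbf{D}_1) + (1-\lambda)\, \rho_\infty(p_{XY,0}, \mathbf{D}_0)$. Therefore, $\rho_\infty$ satisfies condition 2) in Definition [7]. Similarly, it also satisfies condition 3).

¹ Any tuple which is admissible according to the probability of excess distortion criteria is also admissible according to the expected distortion criteria.
² Using strong-typicality arguments in the proof of the achievability part of the single-letter characterization of the rate-distortion region.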

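(ii) It is sufficient to show that $\forall \rho \in \mathcal{F}_D(\mathcal{P}_{XY})$, for every tuple $(p_{XY},\mathbf{D}) \in \mathcal{P}_{XY} \times \mathcal{D}^2$, for every $t \in \mathbb{Z}^+ \cup \{0\}$, $\rho^A_t(p_{XY},\mathbf{D}) \le \rho(p_{XY},\mathbf{D})$ and $\rho^B_t(p_{XY},\mathbf{D}) \le \rho(p_{XY},\mathbf{D})$ hold. We show this by induction on $t$. For $t = 0$, the statement is true by condition 1) in Definition [7]. Then we assume that for an arbitrary $t \in \mathbb{Z}^+$, $\rho^A_{t-1}(p_{XY},\mathbf{D}) \le \rho(p_{XY},\mathbf{D})$ and $\rho^B_{t-1}(p_{XY},\mathbf{D}) \le \rho(p_{XY},\mathbf{D})$ hold. We will show that $\rho^A_t(p_{XY},\mathbf{D}) \le \rho(p_{XY},\mathbf{D})$ and $\rho^B_t(p_{XY},\mathbf{D}) \le \rho(p_{XY},\mathbf{D})$ hold.

$$
\begin{aligned}
\rho^A_t(p_{XY},\mathbf{D}) &= \max_{(p_{U^t|XY},\, g_A,\, g_B) \in \mathcal{P}^A_t(p_{XY},\mathbf{D})} [H(X|Y,U^t) + H(Y|X,U^t)] \\
&= \max_{p_{U_1|X}} \Big\{ \max_{(p_{U_2^t|XYU_1},\, g_A,\, g_B):\ (p_{U_1|X}\,p_{U_2^t|XYU_1},\, g_A,\, g_B) \in \mathcal{P}^A_t(p_{XY},\mathbf{D})} [H(X|Y,U^t) + H(Y|X,U^t)] \Big\} \\
&\overset{(d)}{=} \max_{p_{U_1|X}} \Big\{ \max_{\substack{\mathbf{D}_{u_1} \in \mathcal{D}^2,\ \forall u_1 \in \mathcal{U}_1: \\ E[\mathbf{D}_{U_1}] = \mathbf{D}}} \ \max_{\substack{(p_{U_2^t|XYU_1},\, g_A,\, g_B): \\ (p_{U_1|X}(u_1|\cdot)\, p_{U_2^t|XYU_1}(\cdot|\cdot,\cdot,u_1),\, g_A,\, g_B) \in \mathcal{P}^A_t(p_{XY|U_1}(\cdot,\cdot|u_1), \mathbf{D}_{u_1})}} \{H(X|Y,U^t) + H(Y|X,U^t)\} \Big\} \\
&\overset{(e)}{=} \max_{p_{U_1|X}} \Big\{ \max_{\substack{\mathbf{D}_{u_1} \in \mathcal{D}^2,\ \forall u_1 \in \mathcal{U}_1: \\ E[\mathbf{D}_{U_1}] = \mathbf{D}}} \ \sum_{u_1 \in \mathrm{supp}(p_{U_1})} p_{U_1}(u_1) \max_{\substack{(p_{U_2^t|XYU_1}(\cdot|\cdot,\cdot,u_1),\, g_A(u_1,\ldots),\, g_B(u_1,\ldots)): \\ (p_{U_1|X}(u_1|\cdot)\, p_{U_2^t|XYU_1}(\cdot|\cdot,\cdot,u_1),\, g_A(u_1,\cdot),\, g_B(u_1,\cdot)) \in \mathcal{P}^A_t(p_{XY|U_1}(\cdot,\cdot|u_1), \mathbf{D}_{u_1})}} \{H(X|Y,U_2^t,U_1=u_1) + H(Y|X,U_2^t,U_1=u_1)\} \Big\} \qquad (15) \\
&\overset{(f)}{=} \max_{p_{U_1|X}} \Big\{ \max_{\substack{\mathbf{D}_{u_1} \in \mathcal{D}^2,\ \forall u_1 \in \mathcal{U}_1: \\ E[\mathbf{D}_{U_1}] = \mathbf{D}}} \ \sum_{u_1 \in \mathrm{supp}(p_{U_1})} p_{U_1}(u_1)\, \rho^B_{t-1}\big(p_{XY|U_1}(\cdot,\cdot|u_1), \mathbf{D}_{u_1}\big) \Big\} \qquad (16) \\
&\overset{(g)}{\le} \max_{p_{U_1|X}} \Big\{ \max_{\substack{\mathbf{D}_{u_1} \in \mathcal{D}^2,\ \forall u_1 \in \mathcal{U}_1: \\ E[\mathbf{D}_{U_1}] = \mathbf{D}}} \ \sum_{u_1 \in \mathrm{supp}(p_{U_1})} p_{U_1}(u_1)\, \rho\big(p_{XY|U_1}(\cdot,\cdot|u_1), \mathbf{D}_{u_1}\big) \Big\} \\
&\overset{(h)}{\le} \rho(p_X\, p_{Y|X}, D_A, D_B).
\end{aligned}
$$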

The reasoning for steps (d), (e) and (f) is similar to that for steps (a), (b) and (c) in the proof of part (i). In step (d), when the overall distortion constraints are replaced by the individual distortion constraints, the maximum is not changed, because we go through all the possibilities for the individual distortion levels $\mathbf{D}_{u_1}$ satisfying $E[\mathbf{D}_{U_1}] = \sum_{u_1} \mathbf{D}_{u_1}\, p_{U_1}(u_1) = \mathbf{D}$. In step (f) we need to confirm that $p_{XY|U_1}(\cdot,\cdot|u_1) \in \mathcal{P}_{Y|X}(p_{XY})$. The reasoning is the same as in step (e) in the proof of Theorem [1]. Step (g) is due to the inductive hypothesis $\rho^B_{t-1}(p_{XY},\mathbf{D}) \le \rho(p_{XY},\mathbf{D})$. Step (h) is due to Jensen's inequality applied to the concave functional $\rho$. Using similar steps as above, we can also show $\rho^B_t(p_{XY},\mathbf{D}) \le \rho(p_{XY},\mathbf{D})$. ■

Theorem [2] conveys the same intuition discussed in Sec. [III]. The main difference is that for each realization $U_1 = u_1$, the distortion vector $\mathbf{D}_{u_1}$ in the $(t-1)$-message subproblem could be different from the original distortion vector $\mathbf{D}$, as long as $E[\mathbf{D}_{U_1}] = \mathbf{D}$. Therefore, we need to convexify over the distortion vector.

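D. Iterative algorithm for computing $R^A_{sum,t}(p_{XY},\mathbf{D})$ and $R_{sum,\infty}(p_{XY},\mathbf{D})$

The iterative algorithm presented in Sec. [IV] can also be extended to the rate-distortion problem as follows. Equation (16) states that $\rho^A_t(p_{XY},\mathbf{D})$ is the maximum value of $\rho \in \mathbb{R}$ such that $(p_{XY}, \mathbf{D}, \rho)$ is a convex combination of $\{(p_{XY|U_1}(\cdot,\cdot|u_1),\ \mathbf{D}_{u_1},\ \rho^B_{t-1}(p_{XY|U_1}(\cdot,\cdot|u_1), \mathbf{D}_{u_1}))\}_{u_1 \in \mathrm{supp}(p_{U_1})}$. Consider the hypograph of $\rho^B_{t-1}$ on $\mathcal{P}_{Y|X}(p_{XY}) \times \mathcal{D}^2$: $\mathrm{hyp}_{\mathcal{P}_{Y|X}(p_{XY}) \times \mathcal{D}^2}\,\rho^B_{t-1} := \{(p'_{XY}, \mathbf{D}, \rho) : (p'_{XY}, \mathbf{D}) \in \mathcal{P}_{Y|X}(p_{XY}) \times \mathcal{D}^2,\ \rho \le \rho^B_{t-1}(p'_{XY}, \mathbf{D})\}$. Due to (16), the convex hull of $\mathrm{hyp}_{\mathcal{P}_{Y|X}(p_{XY}) \times \mathcal{D}^2}\,\rho^B_{t-1}$ is $\mathrm{hyp}_{\mathcal{P}_{Y|X}(p_{XY}) \times \mathcal{D}^2}\,\rho^A_t$. Therefore, we have the following algorithm, which is similar to the one presented in Sec. [IV].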

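Algorithm to evaluate $R^A_{sum,t}(\mathbf{D})$ and $R^B_{sum,t}(\mathbf{D})$

• Initialization: Choose a marginal-perturbations-closed family $\mathcal{P}_{XY}$ containing all source joint pmfs of interest. Define $\rho^A_0(p_{XY},\mathbf{D}) = \rho^B_0(p_{XY},\mathbf{D}) = \rho_0(p_{XY},\mathbf{D})$ by equation (13) in the domain $\mathcal{P}_{XY} \times \mathcal{D}^2$. Choose a cover for $\mathcal{P}_{XY}$ made up of X-marginal perturbation sets, denoted by $\{\mathcal{P}_{Y|X}(p_{XY})\}_{p_{XY} \in \mathcal{A}}$, where $\mathcal{A} \subseteq \mathcal{P}_{XY}$. Also choose a cover for $\mathcal{P}_{XY}$ made up of Y-marginal perturbation sets, denoted by $\{\mathcal{P}_{X|Y}(p_{XY})\}_{p_{XY} \in \mathcal{B}}$, where $\mathcal{B} \subseteq \mathcal{P}_{XY}$.

• Loop: For $\tau = 1$ through $t$ do the following.

For every $p_{XY} \in \mathcal{A}$, do the following in the set $\mathcal{P}_{Y|X}(p_{XY}) \times \mathcal{D}^2$.

- Construct $\mathrm{hyp}_{\mathcal{P}_{Y|X}(p_{XY}) \times \mathcal{D}^2}\,\rho^B_{\tau-1}$.

- Let $\rho^A_\tau$ be the upper boundary of the convex hull of $\mathrm{hyp}_{\mathcal{P}_{Y|X}(p_{XY}) \times \mathcal{D}^2}\,\rho^B_{\tau-1}$.

For every $p_{XY} \in \mathcal{B}$, do the following in the set $\mathcal{P}_{X|Y}(p_{XY}) \times \mathcal{D}^2$.

- Construct $\mathrm{hyp}_{\mathcal{P}_{X|Y}(p_{XY}) \times \mathcal{D}^2}\,\rho^A_{\tau-1}$.

- Let $\rho^B_\tau$ be the upper boundary of the convex hull of $\mathrm{hyp}_{\mathcal{P}_{X|Y}(p_{XY}) \times \mathcal{D}^2}\,\rho^A_{\tau-1}$.

• Output: $R^A_{sum,t}(p_{XY},\mathbf{D}) = H(X|Y) + H(Y|X) - \rho^A_t(p_{XY},\mathbf{D})$, and $R^B_{sum,t}(p_{XY},\mathbf{D}) = H(X|Y) + H(Y|X) - \rho^B_t(p_{XY},\mathbf{D})$.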

Here we need to discretize the set $\mathcal{P}_{XY} \times \mathcal{D}^2$. $R_{sum,\infty}(p_{XY},\mathbf{D})$ can also be evaluated to any precision by running this algorithm to a large enough value of $t$, until the change between $\rho^A_{t-1}(p_{XY},\mathbf{D})$ and $\rho^A_t(p_{XY},\mathbf{D})$ is below a certain threshold. In the special case $t = 1$ and $d_A = 0$, the interactive problem reduces to the Wyner-Ziv problem. If we further assume that $|\mathcal{Y}| = 1$, the Wyner-Ziv problem reduces to the single-terminal rate-distortion problem. Therefore, the algorithm described above can be used to evaluate the single-terminal and Wyner-Ziv rate-distortion functions as special cases.

VII. Concluding remarks 

In this work, we studied a two-terminal interactive function computation problem with alternating messages within 
the framework of distributed block source coding theory. We introduced a new convex-geometric approach to provide a 
blocklength-free single-letter characterization of the infinite-message sum-rate-distortion function as a functional of the joint 
source pmf and distortion levels. This characterization is not obtained by taking a limit as the number of messages goes to 
infinity. Instead, it is in terms of the least element of a family of partially-ordered, marginal-perturbations-concave functionals 
defined by the coupled per-sample distortion criteria. An interesting direction would be to find an efficient algorithm to search 
for the least element in the family of functionals. 
Appendix I

ACHIEVABILITY OF R* IN SEC. V-B

The achievability proof of $R^*(p,q)$ for $(p,q) \in (0, 1/2]^2$ uses the same technique presented in [1, Sec. IV.F]. Once we have established $R^*(p,q)$ for $(p,q) \in (0, 1/2]^2$, $R^*(1-p, q)$ is an achievable sum-rate for $(p,q) \in [1/2, 1) \times (0, 1/2]$. The reason is that when $p \ge 1/2$, $X^c \sim \mathrm{Ber}(1-p)$ where $(1-p) \le 1/2$. Using the achievable scheme for $(p,q) \in (0, 1/2]^2$, B can compute $X^c \wedge Y$. Then $X \wedge Y = (X^c \wedge Y)^c \wedge Y$ can be computed locally at B. For $1/2 \le q \le 1$, the rate $H(X) = h_2(p)$ is achieved using one message from A to B sending X. Now we will show the achievability of $R^*(p,q)$ for $(p,q) \in (0, 1/2]^2$.

Define real auxiliary random variables $(V_x, V_y) \sim \mathrm{Uniform}([0,1]^2)$. If $X := 1_{[1-p,1]}(V_x)$ and $Y := 1_{[1-q,1]}(V_y)$, then $(X, Y)$ has the correct joint pmf, i.e., $p_X(1) = 1 - p_X(0) = p$, $p_Y(1) = 1 - p_Y(0) = q$ and $X \perp Y$. We will interpret 0 and 1 as real zero and real one respectively as needed. This interpretation will allow us to express Boolean arithmetic in terms of real arithmetic. Thus $X \wedge Y$ (Boolean AND) $= XY$ (real multiplication). Define a rate-allocation curve $\Gamma$ parametrically by $\Gamma := \{(\alpha(s), \beta(s)),\ 0 \le s \le 1\}$ where $\alpha$ and $\beta$ are real, nondecreasing, absolutely continuous functions with $\alpha(0) = \beta(0) = 0$, $\alpha(1) = (1-p)$, and $\beta(1) \in [0, 1-q]$. Note that in [1, Sec. IV.F], where the AND function is computed at both terminals rather than only terminal B, $\Gamma$ needs to satisfy a different condition $\beta(1) = (1-q)$. The significance of $\Gamma$ will become clear later. Now choose a partition of $[0,1]$, $0 = s_0 < s_1 < \ldots < s_{t/2-1} < s_{t/2} = 1$, such that $\max_{i=1,\ldots,t/2}(s_i - s_{i-1}) < \Delta_t$. For $i = 1, \ldots, t/2$, define $t$ auxiliary random variables as follows,

$$
U_{2i-1} := 1_{[\alpha(s_i),1] \times [\beta(s_{i-1}),1]}(V_x, V_y), \qquad U_{2i} := 1_{[\alpha(s_i),1] \times [\beta(s_i),1]}(V_x, V_y).
$$




Fig. 5. (a) 4-message interactive code (b) ∞-message interactive code (c) ∞-message interactive code for $0 \le p \le q \le 1/2$ with rate-allocation curve $\Gamma_1$ (d) ∞-message interactive code for $0 \le q \le p \le 1/2$ with rate-allocation curve $\Gamma_2$.

In Fig. 5(a), $(V_x, V_y)$ is uniformly distributed on the unit square and $U^t$ are defined to be 1 in rectangular regions which are nested. The following properties can be verified:

P1: $U_1 \ge U_2 \ge \ldots \ge U_t$.

P2: $p_{U^t|XY} \in \mathcal{P}_{ent,t}$, or equivalently, $H(X \wedge Y \mid Y, U^t) = 0$: since $U_t = 1_{[1-p,1] \times [\beta(1),1]}(V_x, V_y)$ and $Y = 1_{[1-q,1]}(V_y)$. Therefore $U_t \wedge Y = 1_{[1-p,1] \times [1-q,1]}(V_x, V_y) = X \wedge Y$.

P3: $p_{U^t|XY} \in \mathcal{P}^A_{mc,t}$, which can be equivalently written as Markov chain conditions: for example, consider $U_{2i} - (Y, U^{2i-1}) - X$. $U_{2i-1} = 0 \Rightarrow U_{2i} = 0$ and the Markov chain holds. $U_{2i-1} = Y = 1 \Rightarrow (V_x, V_y) \in [\alpha(s_i), 1] \times [1-q, 1] \Rightarrow U_{2i} = 1$ and the Markov chain holds. Given $U_{2i-1} = 1$, $Y = 0$, $(V_x, V_y) \sim \mathrm{Uniform}([\alpha(s_i), 1] \times [\beta(s_{i-1}), 1-q]) \Rightarrow V_x$ and $V_y$ are conditionally independent. Thus $X \perp U_{2i} \mid (U_{2i-1} = 1, Y = 0)$ because X is a function of only $V_x$ and $U_{2i}$ is a function of only $V_y$ upon conditioning. So the Markov chain $U_{2i} - (Y, U^{2i-1}) - X$ holds in all situations.

P4: $(Y, U_{2i}) \perp X \mid U_{2i-1} = 1$: this can be proved by the same method as in P3.

P2 and P3 show that $p_{U^t|XY} \in \mathcal{P}^A_t(p_{XY})$.
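Properties P1 and P2 can be checked empirically by sampling $(V_x, V_y)$. The Python sketch below is illustrative only: it assumes the piecewise-linear curve through $(0,0)$, $(1-p/q, 0)$, $(1-2p, 1-2q)$, $(1-p, 1-2q)$ as the rate-allocation curve (valid for $0 < p \le q \le 1/2$), a parameterization that spends one third of $s \in [0,1]$ on each segment, and the uniform partition $s_i = i/(t/2)$. These choices and all function names are assumptions made for the example.

```python
import random


def gamma1(p, q, s):
    """Point (alpha(s), beta(s)) on the assumed piecewise-linear curve, 0 <= s <= 1."""
    pts = [(0.0, 0.0), (1 - p / q, 0.0), (1 - 2 * p, 1 - 2 * q), (1 - p, 1 - 2 * q)]
    if s >= 1.0:
        return pts[-1]
    k = int(3 * s)            # index of the segment containing s
    w = 3 * s - k             # position inside that segment
    (a0, b0), (a1, b1) = pts[k], pts[k + 1]
    return a0 + w * (a1 - a0), b0 + w * (b1 - b0)


def auxiliary_variables(p, q, t, vx, vy):
    """U_1, ..., U_t (t even) for one realization (vx, vy) of (V_x, V_y)."""
    half = t // 2
    us = []
    for i in range(1, half + 1):
        a_i, b_i = gamma1(p, q, i / half)
        _, b_prev = gamma1(p, q, (i - 1) / half)
        us.append(int(vx >= a_i and vy >= b_prev))   # U_{2i-1}
        us.append(int(vx >= a_i and vy >= b_i))      # U_{2i}
    return us


def check_p1_p2(p, q, t, n_samples=2000, seed=0):
    """True if P1 (nestedness) and P2 (U_t AND Y = X AND Y) hold on every sample."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        vx, vy = rng.random(), rng.random()
        x, y = int(vx >= 1 - p), int(vy >= 1 - q)
        us = auxiliary_variables(p, q, t, vx, vy)
        nested = all(us[k] >= us[k + 1] for k in range(t - 1))
        if not nested or us[-1] * y != x * y:
            return False
    return True
```

A sample-based check of this kind does not replace the arguments for P3 and P4, which concern conditional independence rather than pointwise identities.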



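For $i = 1, \ldots, t/2$, the $(2i)$-th rate is given by

$$
\begin{aligned}
I(Y; U_{2i} \mid X, U^{2i-1}) &\overset{P1}{=} I(Y; U_{2i} \mid X, U_{2i-1} = 1)\, p_{U_{2i-1}}(1) \\
&\overset{P4}{=} I(Y; U_{2i} \mid U_{2i-1} = 1)\, p_{U_{2i-1}}(1) \\
&= H(Y \mid U_{2i-1} = 1)\, p_{U_{2i-1}}(1) - H(Y \mid U_{2i}, U_{2i-1} = 1)\, p_{U_{2i-1}}(1) \\
&\overset{(a)}{=} H(Y \mid U_{2i-1} = 1)\, p_{U_{2i-1}}(1) - H(Y \mid U_{2i} = 1)\, p_{U_{2i}}(1) \\
&= (1 - \alpha(s_i)) \left[ (1 - \beta(s_{i-1}))\, h_2\!\left( \frac{q}{1 - \beta(s_{i-1})} \right) - (1 - \beta(s_i))\, h_2\!\left( \frac{q}{1 - \beta(s_i)} \right) \right] \\
&\overset{(b)}{=} (1 - \alpha(s_i)) \int_{\beta(s_{i-1})}^{\beta(s_i)} \log_2\!\left( \frac{1 - v_y}{1 - q - v_y} \right) dv_y \\
&= \iint_{[\alpha(s_i),1] \times [\beta(s_{i-1}),\beta(s_i)]} w_y(v_y, q)\, dv_x\, dv_y,
\end{aligned}
$$

where step (a) is due to property P1 and because $(U_{2i-1}, U_{2i}) = (1, 0) \Rightarrow Y = 0$, hence $H(Y \mid U_{2i}, U_{2i-1} = 1)\, p_{U_{2i-1}}(1) = H(Y \mid U_{2i} = 1, U_{2i-1} = 1)\, p_{U_{2i-1} U_{2i}}(1, 1) = H(Y \mid U_{2i} = 1)\, p_{U_{2i}}(1)$, and step (b) is because

$$
\frac{\partial}{\partial v_y} \left( -(1 - v_y)\, h_2\!\left( \frac{q}{1 - v_y} \right) \right) = \log_2\!\left( \frac{1 - v_y}{1 - q - v_y} \right) =: w_y(v_y, q).
$$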

The $2i$-th rate can thus be expressed as a 2-D integral of a weight function $w_y$ over the rectangular region $\mathrm{Reg}(2i) := [\alpha(s_i), 1] \times [\beta(s_{i-1}), \beta(s_i)]$ (a horizontal bar in Fig. 5(a)). Therefore, the sum of rates of all messages sent from terminal B to terminal A is the integral of $w_y$ over the union of all the corresponding horizontal bars in Fig. 5(a). Similarly, the sum of rates of all messages sent from terminal A to terminal B can be expressed as the integral of another weight function $w_x(v_x, p) := \log_2((1 - v_x)/(1 - p - v_x))$ over the union of all the vertical bars in Fig. 5(a).

Now let $t \to \infty$ such that $\Delta_t \to 0$. Since $\alpha$ and $\beta$ are absolutely continuous, $(\alpha(s_i) - \alpha(s_{i-1})) \to 0$ and $(\beta(s_i) - \beta(s_{i-1})) \to 0$. The union of the horizontal (resp. vertical) bars in Fig. 5(a) tends to the region $\mathcal{W}_y$ (resp. $\mathcal{W}_x$) in Fig. 5(b). Hence an achievable infinite-message sum-rate given by

$$
\iint_{\mathcal{W}_x} w_x(v_x, p)\, dv_x\, dv_y + \iint_{\mathcal{W}_y} w_y(v_y, q)\, dv_x\, dv_y \qquad (17)
$$

depends on only the rate-allocation curve $\Gamma$ which coordinates the progress of source descriptions at A and B. When $0 \le p \le q \le 1/2$, choose $\Gamma = \Gamma_1$ to be the piecewise linear curve connecting $(0,0)$, $(1 - p/q, 0)$, $(1 - 2p, 1 - 2q)$, $(1 - p, 1 - 2q)$ in that order (see Fig. 5(c)). When $0 \le q \le p \le 1/2$, choose $\Gamma = \Gamma_2$ to be the piecewise linear curve connecting $(0,0)$, $(0, 1 - q/p)$, $(1 - 2p, 1 - 2q)$, $(1 - p, 1 - 2q)$ in that order (see Fig. 5(d)). For these two choices of the rate-allocation curve, (17) can be evaluated in closed form and is given by the expressions in the first two cases of (11), which completes the proof.
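The closed-form evaluation can be cross-checked by numerical integration. The Python sketch below rests on one interpretation that is an assumption of this example: the two area integrals are rewritten as a line integral along the curve, with the $w_x$ term weighted by the height $1 - \beta$ of the region above the curve and the $w_y$ term weighted by the width $1 - \alpha$ of the region to its right. A midpoint rule is used, which copes with the integrable logarithmic singularity of $w_x$ at $v_x = 1 - p$. Function names and the step count are hypothetical.

```python
import math


def h2(x):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def weight(v, r):
    """Weight function log2((1 - v) / (1 - r - v))."""
    return math.log2((1.0 - v) / (1.0 - r - v))


def curve_vertices(p, q):
    """Vertices of the first curve (p <= q) or the second curve (q < p), 0 < p, q <= 1/2."""
    second = (1 - p / q, 0.0) if p <= q else (0.0, 1 - q / p)
    return [(0.0, 0.0), second, (1 - 2 * p, 1 - 2 * q), (1 - p, 1 - 2 * q)]


def sum_rate_integral(p, q, steps=20000):
    """Midpoint-rule value of the infinite-message sum-rate along the chosen curve."""
    total = 0.0
    verts = curve_vertices(p, q)
    for (a0, b0), (a1, b1) in zip(verts, verts[1:]):
        da, db = (a1 - a0) / steps, (b1 - b0) / steps
        for k in range(steps):
            a = a0 + (k + 0.5) * da
            b = b0 + (k + 0.5) * db
            if da > 0.0:   # messages from A: strip of height (1 - b) above the curve
                total += weight(a, p) * (1.0 - b) * da
            if db > 0.0:   # messages from B: strip of width (1 - a) right of the curve
                total += weight(b, q) * (1.0 - a) * db
    return total


def closed_form(p, q):
    """First two cases of the closed-form sum-rate, for 0 < p, q <= 1/2."""
    if p > q:
        p, q = q, p
    return (h2(p) + p * h2(q) + p * math.log2(q)
            + p * (1.0 - 2.0 * q) * math.log2(math.e))
```

Comparing `sum_rate_integral(p, q)` with `closed_form(p, q)` at a few points in $(0, 1/2]^2$ gives a quick consistency check of the two curve choices; agreement is limited by the quadrature step.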

Remark 7: The two curves $\Gamma_1$ and $\Gamma_2$ were specifically chosen to minimize the value of (17). Although the minimization steps are nontrivial, they are omitted because the achievability of $R^*$ does not rely on them.